(54) Title: MODIFIED RECOMBINASE

(57) Abstract: The present invention concerns a fusion protein comprising a recombinase protein, preferably the site-specific DNA
recombinase C31-Int of phage ΦC31, and a peptide sequence which directs the nuclear uptake of the fusion protein in eucaryotic
cells, and the use of this fusion protein to recombine, invert or delete DNA molecules containing recognition sequences for said
recombinase in eucaryotic cells at high efficiency. In addition, the invention relates to a cell, preferably a mammalian cell, which
contains recognition sequences for said recombinase in its genome and wherein the genome is recombined by the action of said
fusion protein. Moreover, the invention relates to the use of said cell to study the function of genes and for the creation of transgenic
organisms to study gene function at various developmental stages, including the adult. In conclusion, the present invention provides
a process which enables the highly efficient modification of the genome of mammalian cells by site-specific recombination.
Modified Recombinase

The present invention concerns a fusion protein comprising a recombinase
protein, preferably the site-specific DNA recombinase C31-Int of phage ΦC31,
and a peptide sequence which directs the nuclear uptake of the fusion protein in
eucaryotic cells, and the use of this fusion protein to recombine, invert or delete
DNA molecules containing recognition sequences for said recombinase in
eucaryotic cells at high efficiency. In addition, the invention relates to a cell,
preferably a mammalian cell, which contains recognition sequences for said
recombinase in its genome and wherein the genome is recombined by the action
of said fusion protein. Moreover, the invention relates to the use of said cell to
study the function of genes and for the creation of transgenic organisms to study
gene function at various developmental stages, including the adult. In conclusion,
the present invention provides a process which enables the highly efficient
modification of the genome of mammalian cells by site-specific recombination.

Background of the Invention

The controlled and permanent modification of the genome of eucaryotic cells and 
organisms is an important method for research applications, e.g. for studying 
gene function, for medical applications like gene therapy and the creation of 
disease models and for the design of economically important animals and crops. 

The basic methods for genome manipulation by the engineering of endogenous
genes through gene targeting in murine embryonic stem (ES) cells are well
established and have been in use for many years (Capecchi, Trends in Genetics, 5, 70-76
(1989)). Since ES cells can pass mutations induced in vitro to transgenic
offspring in vivo, it is possible to analyse the consequences of gene disruption in
the context of the entire organism. Thus, numerous mouse strains with
functionally inactivated genes ("knock-out mice") have been created by this
technology and utilised to study the biological function of a variety of genes
(Koller et al., Ann. Rev. Immunol., 10, 705-730 (1992)). More importantly,
mouse mutants created by this procedure (also known as "conventional,
complete or classical mutants") contain the inactivated gene in all cells and
tissues throughout life. Thus, classical mouse mutants represent the best animal
model for inherited human diseases, as the mutation is introduced into the
germline, but are not the optimal model to study gene function in adults, e.g. to
validate potential drug target genes.
A refined method of targeted mutagenesis, referred to as conditional
mutagenesis, employs the Cre/loxP site-specific recombination system, which
enables the temporally and/or spatially restricted inactivation of target genes in
cells or mice (Rajewsky et al., J. Clin. Invest., 98, 600-603 (1996)). The phage
P1 derived Cre recombinase recognises a 34 bp sequence, referred to as a loxP site,
which is structured as an inverted repeat of 13 bp separated by an asymmetric 8
bp sequence which defines the direction of the loxP site. If two loxP sites are
located on a DNA molecule in the same orientation, the intervening DNA sequence
is excised by Cre recombinase from the parental molecule as a closed circle,
leaving one loxP site on each of the reaction products (Kilby et al., TIG, 9,
413-421 (1993)). The creation of conditional mouse mutants initially requires the
generation of two mouse strains, one containing two or more Cre recombinase
recognition (loxP) sites in its genome while the other harbours a Cre transgene.
The former strain is generated by homologous recombination in ES cells as
described above, except that the exon(s) of the target gene is (are) flanked by
two loxP sites which reside in introns and do not interfere with gene expression.
The Cre transgenic strain contains a transgene whose expression is either
constitutively active in certain cells and tissues or is inducible by external agents,
depending on the promoter region used. Crossing of the loxP-flanked mouse
strain with the Cre recombinase expressing strain enables the deletion of the
loxP-flanked exons in the genome of doubly transgenic offspring in a prespecified
temporally and/or spatially restricted manner. Thus, the method allows the
analysis of gene function in particular cell types and tissues of otherwise widely
expressed genes. Moreover, it enables the analysis of gene function in the adult
organism by circumventing embryonic lethality, which is often the consequence of
complete (germline) gene inactivation. For pharmaceutical research, aiming to
validate the utility of genes and their products for drug development, gene
inactivation which is inducible in adults provides an excellent genetic tool, as this
mimics the biological effects of target inhibition upon drug application.
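
The structure of the loxP site and the outcome of a deletion between two sites in the same orientation can be illustrated with a minimal Python sketch. The 34 bp sequence used here is the commonly cited wild-type loxP sequence, which is not given in the present text, and the function names (`is_lox_like`, `excise`) are illustrative assumptions only. The model treats DNA as a plain string of one strand and does not attempt to represent the strand-exchange chemistry.

```python
# Commonly cited wild-type loxP sequence (an assumption of this sketch; the
# text above only states its length and arm/spacer organisation).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_site(site: str):
    """Split a 34 bp site into (13 bp left arm, 8 bp spacer, 13 bp right arm)."""
    if len(site) != 34:
        raise ValueError("expected a 34 bp site")
    return site[:13], site[13:21], site[21:]


def is_lox_like(site: str) -> bool:
    """Check for 13 bp inverted repeats around an asymmetric 8 bp spacer."""
    left, spacer, right = split_site(site)
    arms_inverted = right == reverse_complement(left)
    spacer_asymmetric = spacer != reverse_complement(spacer)
    return arms_inverted and spacer_asymmetric


def excise(dna: str, site: str = LOXP):
    """Model deletion between the first two directly repeated sites.

    Returns (parental product, excised circle). The circle is written as a
    linear string; each product retains exactly one copy of the site.
    """
    first = dna.find(site)
    second = dna.find(site, first + len(site)) if first != -1 else -1
    if second == -1:
        raise ValueError("need two sites in the same orientation")
    parental = dna[:first] + site + dna[second + len(site):]
    circle = dna[first + len(site):second] + site
    return parental, circle
```

For a toy substrate such as `"GG" + LOXP + "AAAA" + LOXP + "CC"`, `excise` returns a parental product in which the `AAAA` segment is missing and a circle carrying `AAAA`, with one site remaining on each product.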
Since the first description of the concept of conditional gene targeting using the
Cre/loxP system in mice in 1994 (Gu et al., Science, 265, 103-106 (1994)), this
method has become increasingly popular among the research community and
has resulted in a broad collection of genetic tools for biological research in the mouse.
More than 30 Cre transgenic mouse strains with various tissue specificities for
gene inactivation have been created, including several "deleter" strains which
allow removal of the loxP-flanked target gene segment in the male or female
germline (Cohen-Tannoudji et al., Mol. Hum. Reprod., 4, 929-938 (1998);
Metzger et al., Curr. Op. Biotech., 10, 470-476 (1999)). The need to characterise
the expression pattern of Cre mediated recombination in newly generated strains
stimulated the construction of a number of "Cre-reporter" strains which harbour
a silent reporter gene, the expression of which is activated upon Cre-mediated
deletion (Nagy, Genesis, 26, 99-109 (2000)). Conditional mouse mutants have
been reported for about 20 different genes, many of which could not be studied in
adults as their complete inactivation leads to embryonic lethality
(Cohen-Tannoudji et al., Mol. Hum. Reprod., 4, 929-938 (1998)).

Great efforts have also been made to control the expression of Cre recombinase
in an inducible fashion in mice. After the first demonstration that inducible gene
knock-out is feasible in adult mice using an interferon controlled promoter (Kuhn
et al., Science, 269, 1427-1429 (1995)), mainly two methods were applied to
control the activity of Cre recombinase. First, it has been demonstrated that the
fusion of Cre with the ligand binding domain of a mutant estrogen receptor allows
recombinase activity to be controlled by a specific steroid-like inducer. Several
transgenic mouse strains expressing such a fusion protein have been generated
and allow gene inactivation to be induced in specific tissues (Metzger et al., Curr. Op.
Biotech., 10, 470-476 (1999)). Furthermore, the tetracycline-regulated gene
expression system has been successfully used to control the expression of Cre in
transgenic mice and thus provides a second system for inducible gene
inactivation, using doxycycline as inducer (Saam et al., J. Biol. Chem., 274,
38071-38082 (1999)).

In addition to the application of Cre/loxP for gene inactivation by deletion of a
gene segment, this recombination system has also proved useful for a
number of other genomic manipulations in ES cells or mice. These include the
conditional activation of transgenes in mice, chromosome engineering to obtain
deletion, translocation or inversion, the simple removal of selection marker
genes, gene replacement, the targeted insertion of transgenes and the
(in)activation of genes by inversion (Nagy, Genesis, 26, 99-109 (2000);
Cohen-Tannoudji et al., Mol. Hum. Reprod., 4, 929-938 (1998)). In conclusion, the
Cre/loxP recombination system has proven to be extremely useful for the
analysis of gene function in mice by broadening the methodological spectrum for
genome engineering. It can be expected that many of the protocols now
established for the mouse may in future also be applied to other animals or
plants.

In contrast to the huge diversity of genome manipulations which have been
developed for the Cre/loxP system, very limited efforts have been made to
develop further site-specific recombination systems for use in mammalian
cells. Alternative recombination systems of different specificity but with an
efficiency comparable to Cre/loxP could further enhance the flexibility of genome
engineering by the side-by-side use of two or more systems in the same cell or
organism. Furthermore, unidirectional recombination systems which follow a
different mechanism than the reversible Cre/loxP-mediated recombination should
allow the development of new applications for genome engineering which cannot be
performed with the current systems.

The reasons for the almost exclusive use of the Cre/loxP system for site-specific
recombination in mammalian cells are readily explained by a number of
requirements which must be fulfilled for the efficient use of a recombinase in
mammalian cells:

i) the recombinase should act independently of cofactors like helper
proteins,

ii) it should act independently of the supercoiling status of the target DNA
and also on mammalian chromatin,

iii) it should be efficiently active and stable at a temperature of 37°C, and

iv) it should recognize a target sequence which is sufficiently long to be
unique among large genomes, and it should exhibit a very high affinity
to its target site for efficient action (Kilby et al., TIG, 9, 413-421
(1993)).
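
Requirement iv) can be made concrete with a rough estimate of how often a target sequence of a given length would be expected to occur by chance. The following Python sketch assumes a random genome with equal base frequencies and a genome size of about 3 x 10^9 bp; both are simplifying assumptions introduced here for illustration and are not taken from the text. The function name is hypothetical.

```python
def expected_random_hits(site_length: int,
                         genome_size: float,
                         both_strands: bool = True) -> float:
    """Expected number of chance occurrences of one specific sequence.

    Assumes independent, equally frequent bases (a simplification):
    each position matches with probability (1/4) ** site_length.
    """
    strands = 2 if both_strands else 1
    return strands * genome_size / 4 ** site_length


# Assumed, approximate size of a mammalian haploid genome in bp.
MAMMALIAN_GENOME_BP = 3e9

hits_34bp = expected_random_hits(34, MAMMALIAN_GENOME_BP)  # loxP-sized site
hits_8bp = expected_random_hits(8, MAMMALIAN_GENOME_BP)    # short site
```

Under these assumptions a 34 bp site is expected far less than once per genome (on the order of 10^-11), whereas an 8 bp sequence would be expected tens of thousands of times, which is why a short recognition sequence would not be usable for targeted recombination in large genomes.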
Among the more than 200 described members of the integrase and
resolvase/invertase recombinase families only the Cre/loxP system is presently
known to fulfill all of these requirements (Nunes-Düby et al., Nucleic Acids Res.,
26, 391-406 (1998); Kilby et al., TIG, 9, 413-421 (1993); Ringrose et al., J. Mol.
Biol., 284, 363-384 (1998)). Besides Cre/loxP, a few recombinases have been
shown to exhibit some activity in mammalian cells, but their practical value is
presently unclear as their efficiency has not been compared to the Cre/loxP
system on the same genomic recombination substrate, and in some cases it is
known that one or more of the criteria listed above are not met. The best
characterised examples are the yeast derived FLP and Kw recombinases, which
exhibit a temperature optimum at 30°C but are unstable at 37°C (Buchholz
et al., Nature Biotech., 16, 657-662 (1998); Ringrose et al., Eur. J. Biochem.,
248, 903-912). For FLP it has been shown in addition that its affinity to the FRT
target site is much lower than the affinity of Cre to loxP sites
(Ringrose et al., J. Mol. Biol., 284, 363-384 (1998)). Other recombinases which
show in principle some activity in mammalian cells are a mutant integrase of
phage λ, the integrases of phages ΦC31 and HK022, mutant γδ-resolvase and
β-recombinase (Lorbach et al., J. Mol. Biol., 296, 1175-81 (2000); Groth et al.,
Proc. Natl. Acad. Sci. USA, 97, 5995-6000 (2000); Kolot et al., Mol. Biol. Rep.,
26, 207-213 (1999); Schwikardi et al., FEBS Lett., 471, 147-150 (2000); Diaz
et al., J. Biol. Chem., 274, 6634-6640 (1999)). Other phage integrase systems
include coliphage P4 recombinase, Listeria phage recombinase, bacteriophage R4
Sre recombinase, CisA recombinase, XisF recombinase and transposon Tn4451
TnpX recombinase (Stark et al., Trends in Genetics, 8, 432-439 (1992); Hatfull &
Grindley, in Genetic Recombination, Eds. Kucherlapati & Smith, Am. Soc.
Microbiol., Washington DC, 357-396 (1988)).

However, the practical value of these recombinases and integrases for use in
mammalian cells is limited, as their efficiency in recombining mammalian genomic
DNA has not been tested or compared with the Cre/loxP system. From the data
available it can be assumed that these recombinases are much less effective than
the Cre/loxP system.



SUBSTITUTE SHEET (RULE 26) 



WO 02/38613 PCT/EP01/12975 


In a few cases attempts have been made to improve the performance of
recombinases in mammalian cells: for FLP a mutant showing improved
thermostability and activity at 37°C has been isolated, but this mutant is still
considerably more heat labile than Cre (Buchholz et al., Nature
Biotech., 16, 657-662 (1998)). In the case of λ-integrase and γδ-resolvase the
absolute requirement for coproteins and supercoiled DNA could be eliminated by
the introduction of specific point mutations (Schwikardi et al., FEBS Lett., 471,
147-150 (2000)).



The import of cytoplasmic proteins into the nucleus of eucaryotic cells through
nuclear pores is a regulated, energy dependent process mediated by specific
receptors (Gorlich et al., Science, 271, 1513-1518 (1996)). Proteins which do
not possess a signal sequence recognised by the nuclear import machinery are
excluded from the nucleus and remain in the cytoplasm. Numerous such
nuclear localisation signal sequences (NLS), which share a high proportion of
basic amino acids, have been characterised (Boulikas, Crit. Rev.
Eucar. Gene Expression, 3, 193-227 (1993)), the prototype of which is the 7
amino acid NLS derived from the T-antigen of the SV40 virus (Kalderon et al.,
Cell, 39, 499-509 (1984)).


It was believed that the fusion of such an NLS peptide to a recombinase might
enhance the efficiency of the recombinase by mediating its import into the
nucleus and thereby increasing the concentration of the recombinase inside the
nucleus. However, for Cre recombinase it has been shown that the addition of
the SV40 T-antigen NLS does not improve its recombination efficiency in
mammalian cells (Le et al., Nucleic Acids Res., 27, 4703-4709 (1999)).
Nevertheless, both Cre and a Cre-NLS fusion protein are widely used. Schwikardi
(Schwikardi et al., FEBS Lett., 471, 147-150 (2000)) reported a γδ-resolvase-SV40
T-antigen NLS fusion protein, which also did not show enhanced recombination
efficiency.

The level of activity exhibited by recombinases of diverse prokaryotic origin in
mammalian cells may be the result of the intrinsic properties of an enzyme,
depending on parameters like its temperature optimum, its target site affinity,
protein structure and stability, the degree of cooperativity, the stability of the
synaptic complex and the dependence on coproteins or supercoiled DNA. Within
the specific environment of mammalian cells the activity of a prokaryotic
recombinase could be limited by additional factors such as a short half-life of the
recombinase transcript, a short half-life of its protein, its inability to act on
histone-complexed and higher order structured mammalian genomic DNA,
exclusion from the nucleus or the recognition of cryptic splice sites within its
mRNA resulting in a nonfunctional transcript. Due to the lack of information on
the parameters listed above for almost all recombinases, it is presently not
possible to rationally optimise their performance in mammalian cells.


Summary of the Invention 



The object of the invention of the present application is the
provision of a recombination system alternative to the Cre/loxP system, which
has a different specificity but an efficiency comparable to Cre/loxP. Such an
alternative recombination system is particularly desirable for all those
applications which require more than one potent recombination system in order to be
successfully carried out (e.g. the methods disclosed in PCT/EP01/00060 and
PCT/EP00/10162). Most surprisingly, it was found that the above object can be
achieved by fusing a signal peptide capable of directing nuclear import
(hereinafter referred to in short as a nuclear localisation signal sequence (NLS)) to
specific recombinases.

In contrast to the wildtype recombinases, the resulting modified recombinases
allow a highly efficient recombination of extrachromosomal and chromosomal
DNA in mammalian cells, and a highly efficient excision of extrachromosomal and
chromosomal DNA stretches which are flanked by suitable recognition sites for
said modified recombinases.



The present invention thus provides:

(1) a fusion protein (hereinafter also referred to as "modified recombinase")
comprising

(a) a recombinase domain comprising a recombinase protein or fragment thereof
and

(b) a signal peptide domain being linked to (a) and directing the nuclear import
of said fusion protein in eucaryotic cells,

preferably the activity of the fusion protein in eucaryotic cells is significantly
higher as compared to the activity of the wildtype recombinase corresponding
to the recombinase of the recombinase domain;

(2) in a preferred embodiment of the fusion protein defined in (1) above, the
recombinase domain comprises an integrase protein, preferably a phage ΦC31
integrase (C31-Int) protein or a mutant thereof;

(3) a DNA coding for the fusion protein as defined in (1) or (2) above;

(4) a vector containing the DNA as defined in (3) above;

(5) a microorganism containing the DNA of (3) above and/or the vector of (4)
above;

(6) a process for preparing the fusion protein as defined in (1) or (2) above
which comprises culturing a microorganism as defined in (5) above;

(7) the use of the fusion protein as defined in (1) or (2) above to recombine DNA
molecules, which contain recombinase recognition sequences for the
recombinase protein of the recombinase domain, in eucaryotic cells;

(8) a cell, preferably a mammalian cell, containing the DNA sequence of (3) above
in its genome;

(9) the use of the cell of (8) above for studying the function of genes and for the
creation of transgenic organisms;

(10) a transgenic organism, preferably a transgenic mammal, containing the DNA
sequence of (3) above in its genome;

(11) the use of the transgenic organism of (10) above for studying gene function
at various developmental stages; and

(12) a method for recombining DNA molecules of cells or organisms containing
recognition sequences for the recombinase protein of the recombinase domain as
defined in (1) or (2) above, which method comprises supplying the cells or
organisms with a fusion protein as defined in (1) or (2) above, or with a DNA
sequence of (3) above and/or a vector of (4) above which are capable of
expressing said fusion protein in the cell or organism.

The present invention combines the use of prokaryotic recombinases such as
C31-Int with a eukaryotic signal sequence which increases their efficiency in
mammalian cells such that it is equal to that of the widely used Cre/loxP recombination
system. The improved recombination system of the present invention provides
an alternative recombination system for use in mammalian cells and organisms
which allows the same types of genomic modifications to be performed as shown for
Cre/loxP, including conditional gene inactivation by recombinase-mediated
deletion, the conditional activation of transgenes in mice, chromosome
engineering to obtain deletion, translocation or inversion, the simple removal of
selection marker genes, gene replacement, the targeted insertion of transgenes
and the (in)activation of genes by inversion.

Short Description of Figures 

Fig. 1: C31-Int and Cre recombinase expression vectors and a recombinase
reporter vector used for transient and stable transfections.

Fig. 2: Results of transient transfections of C31-Int and Cre expression vectors
and reporter vectors into CHO cells.

Fig. 3: Results of transient transfections of XisA and Ssv recombinase
expression vectors with and without nuclear localisation signals and reporter
vectors into CHO cells.

Fig. 4: Results of transient transfections of C31-Int and Cre recombinase vectors
into a stable reporter cell line.

Fig. 5: In situ detection of β-galactosidase in 3T3(pRK64)-3 cells transfected with
recombinase expression vectors.

Fig. 6: Test vector for C31-Int mediated deletion, pRK64, and the expected
deletion product.

Fig. 7: PCR products generated with the primers P64-1 and P64-4 and sequence
comparison.

Fig. 8: ROSA26 locus of the C31 reporter mice carrying a C31 reporter construct.

Fig. 9: In situ detection of β-galactosidase in a cryosection of the testis of: (A) a
double transgenic mouse carrying both the recombinase and the reporter; and
(B) a transgenic mouse carrying only the reporter as a control.



Detailed Description of the Invention 

The "organisms" according to the present invention are multi-cell organisms and
can be vertebrates such as mammals (humans and non-human animals including
rodents such as mice or rats) or non-mammals (e.g. fish), or can be
invertebrates such as insects or worms, or can be plants (higher plants, algae or
fungi). Most preferred living organisms are mice and fish.

"Cells" and "eucaryotic cells" according to the present invention include cells
isolated from the above defined living organisms and cultured in vitro. These cells
can be transformed (immortalized) or untransformed (directly derived from the
living organism; primary cell culture).

"Microorganism" according to the present invention relates to procaryotes (e.g.
E. coli) and eucaryotic microorganisms (e.g. yeasts).

According to embodiment (1) of the present invention, the activity of the fusion
protein in eucaryotic cells is significantly higher as compared to the activity of
the wildtype recombinase corresponding to the recombinase of the recombinase
domain. A "significantly higher activity" in accordance with the present invention
refers to an increase in activity of at least 50%, preferably at least 75%, more
preferably at least 100% relative to the corresponding wildtype recombinase in
eucaryotic cells. A "significantly higher activity" also implies that the resulting
fusion protein has at least 25%, preferably at least 50% and more preferably at
least 75%, of the activity of Cre/loxP in 3T3 cells with a stably integrated target
sequence.
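
The two thresholds in this definition can be expressed as a short check. The following Python sketch is illustrative only: the function name, argument names and the activity values in the usage note are hypothetical, and the activities are assumed to be measured in the same arbitrary units under the same assay conditions.

```python
def is_significantly_higher(fusion: float,
                            wildtype: float,
                            cre: float,
                            min_increase: float = 0.50,
                            min_fraction_of_cre: float = 0.25) -> bool:
    """Apply both criteria of the definition above.

    fusion, wildtype, cre: measured recombination activities in the same
    (arbitrary) units. The defaults are the minimum thresholds; the preferred
    values would be min_increase=0.75 or 1.00 and
    min_fraction_of_cre=0.50 or 0.75.
    """
    if wildtype <= 0 or cre <= 0:
        raise ValueError("reference activities must be positive")
    relative_increase = (fusion - wildtype) / wildtype
    fraction_of_cre = fusion / cre
    return (relative_increase >= min_increase
            and fraction_of_cre >= min_fraction_of_cre)
```

With hypothetical readings of 60 for the fusion protein, 30 for the wildtype recombinase and 100 for Cre, the increase is 100% and the fusion reaches 60% of Cre activity, so both criteria are met. A fusion at 20 against a wildtype at 10 would double the activity but reach only 20% of Cre, and would therefore not qualify.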

Recombinase proteins which can be used in the recombinase domain of the
fusion protein of the present invention (i.e., giving a fusion having a "significantly
higher activity" as defined above) include, but are not limited to, a certain type
of recombinases belonging to the family of large serine recombinases (Thorpe
et al., Control of directionality in the site-specific recombination system of the
Streptomyces phage ΦC31, Molecular Microbiology, 38(2), 232-241 (2000)). This
family includes bacteriophage ΦC31 integrase ("C31-Int"; the amino acid
sequence of said integrase and a DNA sequence coding therefor are shown in
SEQ ID NOs:21 and 20, respectively), coliphage P4 recombinase, Listeria phage
recombinase, bacteriophage R4 Sre recombinase ("R4 Sre" deposited under GI
793758; the amino acid sequence of said recombinase and a DNA sequence
coding therefor are shown in SEQ ID NOs:55 and 54, respectively), Bacillus
subtilis CisA recombinase ("CisA" deposited under GI 142689; the amino acid
sequence of said recombinase and a DNA sequence coding therefor are shown in
SEQ ID NOs:57 and 56, respectively), XisF recombinase from Anabaena sp.
strain PCC 7120 (Cyanobacterium; "XisF" deposited under GI 349678; the amino
acid sequence of said integrase and a DNA sequence coding therefor are shown
in SEQ ID NOs:59 and 58, respectively), transposon Tn4451 TnpX recombinase
("TnpX" deposited under GI 551135; the amino acid sequence of said
recombinase and a DNA sequence coding therefor are shown in SEQ ID NOs:61
and 60, respectively), "XisA" recombinase from Anabaena sp. strain PCC 7120
(Cyanobacterium; the amino acid sequence of said recombinase and a DNA
sequence coding therefor are shown in SEQ ID NOs:63 and 62, respectively),
"SSV" recombinase from a phage of Sulfolobus shibatae (the amino acid sequence
of said recombinase and a DNA sequence coding therefor are shown in SEQ ID
NOs:65 and 64, respectively), lactococcal bacteriophage TP901-1 recombinase
(TP901-1 complete genome deposited under GI 13786531; the amino acid
sequence of said recombinase and a DNA sequence coding therefor are shown in
SEQ ID NOs:108 and 107, respectively), and the like, or mutants thereof. Other
procaryotic recombinases known in the art are also applicable.

A "mutant" of the above recombinases in accordance with the present invention
relates to a mutant of the respective original (viz. wild-type) recombinase having
a recombinase activity similar (e.g. at least about 90%) to that of said wild-type
recombinase. Mutants include truncated forms of the recombinase (such as N- or
C-terminal truncated recombinase proteins), deletion-type mutants (where one
or more amino acid residues or segments having more than one continuous
amino acid residue have been deleted from the primary sequence of the wildtype
recombinase), replacement-type mutants (where one or more amino acid
residues or segments of the primary sequence of the wildtype recombinase have
been replaced with alternative amino acid residues or segments), or
combinations thereof.
According to embodiment (2) of the invention, the recombinase domain
comprises an integrase protein, preferably a phage ΦC31 integrase (C31-Int)
protein or a mutant thereof. Thus, the present invention provides a fusion protein
comprising

(a) an integrase domain being a C31-Int protein or a mutant thereof, and

(b) a signal peptide domain being linked to (a) and directing the nuclear import
of said fusion protein into eucaryotic cells.

In the fusion protein of embodiment (2), the integrase domain is preferably a
C31-Int having the amino acid sequence shown in SEQ ID NO:21 or a C-terminal
truncated form thereof. Suitable truncated forms of the C31-Int comprise amino
acid residues 306 to 613 of SEQ ID NO:21.

The signal peptide domain (hereinafter also referred to as "NLS") is preferably
derived from yeast GAL4, SKI3, L29 or histone H2B proteins, polyoma virus large
T protein, VP1 or VP2 capsid protein, SV40 VP1 or VP2 capsid protein, Adenovirus
E1a or DBP protein, influenza virus NS1 protein, hepatitis virus core antigen or
the mammalian lamin, c-myc, max, c-myb, p53, c-erbA, jun, Tax, steroid
receptor or Mx proteins (see Boulikas, Crit. Rev. Eucar. Gene Expression, 3,
193-227 (1993)), simian virus 40 ("SV40") T-antigen (Kalderon et al., Cell, 39,
499-509 (1984)) or other proteins with known nuclear localisation. The NLS is
preferably derived from the SV40 T-antigen.

Furthermore, the signal peptide domain preferably has a length of 5 to 74,
more preferably 7 to 15, amino acid residues. More preferred is that the signal peptide
domain comprises a segment of 6 amino acid residues wherein at least 2 amino
acid residues, preferably at least 3 amino acid residues, are positively charged
basic amino acids. Basic amino acids include, but are not limited to, lysine, arginine
and histidine. Particularly preferred signal peptides are shown in the following
table.

Source protein                        Sequence                              SEQ ID NO:

yeast GAL4                            MKx11CRLKKLKCSKEKPKCAKCLKx5Rx3KTKR    24
yeast SKI3                            IKYFKKFPKD                            25
yeast L29                             MTGSKTRKHRGSGA                        26
                                      MTGSKHRKHPGSGA                        27
yeast histone H2B                     GKKRSKA                               28
polyoma virus large T protein         PKKAREDVSRKRPR                        29
polyoma virus VP1 capsid protein      APKRKSGVSKC                           30
polyoma virus VP2 capsid protein      EEDGPQKKKRRL                          31
SV40 VP1 capsid protein               APTKRKGS                              32
SV40 VP2 capsid protein               PNKKKRK                               33
Adenovirus E1a protein                KRPRP                                 34
                                      CGGLSSKRPRP                           35
Adenovirus DBP protein                PPKKRMRRRIEPKKKKKRP                   36
influenza virus NS1 protein           PFLDRLRRDQK                           37
                                      PKQKRKMAR                             38
human lamin A                         SVTKKRKLE                             39
human c-myc                           CGGAAKRVKLD                           40
                                      PAAKRVKLD                             41
                                      RQRRNELKRSP                           42
human max                             PQSRKKLR                              43
human c-myb                           PLLKKIKQ                              44
human p53                             PQPKKKP                               45
human c-erbA                          SKRVAKRKL                             46
viral jun                             ASKSRKRKL                             47
human Tax                             GGLCSARLHRHALLAT                      48
mammalian glucocorticoid receptor     RKTKKKIK                              49
human androgen receptor               RKLKKLGN                              50
mammalian estrogen receptor           RKDRRGGR                              51
Mx proteins                           DTREKKKFLKRRLLRLDE                    52
SV40 T-antigen                        PKKKRKV                               53

The most preferred signal peptide domain is that of SV40 T-antigen having the
sequence Pro-Lys-Lys-Lys-Arg-Lys-Val.
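
The length and basic-residue preferences stated for the signal peptide domain can be checked mechanically. The sketch below, in Python, is an illustration under assumptions: the function names are hypothetical, only lysine, arginine and histidine are counted as basic, and a peptide shorter than 6 residues is treated as a single window (the text does not say how such peptides are to be scored).

```python
BASIC_RESIDUES = set("KRH")  # lysine, arginine, histidine (one-letter codes)


def max_basic_in_window(peptide: str, window: int = 6) -> int:
    """Largest number of basic residues found in any stretch of `window` residues.

    Peptides shorter than `window` are scored as one window (an assumption).
    """
    starts = range(max(1, len(peptide) - window + 1))
    return max(
        sum(aa in BASIC_RESIDUES for aa in peptide[i:i + window])
        for i in starts
    )


def meets_nls_preferences(peptide: str,
                          min_basic: int = 2,
                          min_len: int = 5,
                          max_len: int = 74) -> bool:
    """Check the stated length range and the basic-residue criterion."""
    return (min_len <= len(peptide) <= max_len
            and max_basic_in_window(peptide) >= min_basic)
```

Applied to the SV40 T-antigen sequence PKKKRKV, every 6-residue window contains five basic residues, so the peptide satisfies both the minimum (2) and the preferred (3) criterion and also falls within the more preferred length range of 7 to 15 residues.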

The signal peptide domain may be linked to the N-terminus or C-terminus of the
integrase domain or may be integrated into the integrase domain; preferably the
signal peptide domain is linked to the C-terminus of the integrase domain. With
regard to the phage ΦC31 integrase protein of embodiment (2) of the invention it
was found that the fusion of an NLS peptide to the C-terminus of the integrase
provided a much higher increase of activity as compared to the fusion of the
same NLS peptide to the N-terminus of the integrase (see Example 1, figures 3
and 4).

According to the present invention, the signal peptide domain may be linked to
the integrase domain directly or through a linker peptide. Suitable linkers include
peptides having from 1 to 30, preferably 1 to 15 amino acid residues, said amino
acid residues being essentially neutral amino acids such as Gly, Ala and Val.
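
The arrangement of integrase domain, optional linker and signal peptide domain can be sketched at the level of amino acid sequence strings. The Python example below is a simplified illustration: the function name is hypothetical, the short recombinase string in the usage note is a placeholder and not a real sequence, and restricting linkers to exactly Gly, Ala and Val is an assumption made for the sketch (the text names these only as examples of essentially neutral residues).

```python
SV40_NLS = "PKKKRKV"            # Pro-Lys-Lys-Lys-Arg-Lys-Val, SEQ ID NO:53
NEUTRAL_LINKER_RESIDUES = set("GAV")  # assumed set, see lead-in
MAX_LINKER_LENGTH = 30


def build_fusion(recombinase: str,
                 nls: str = SV40_NLS,
                 linker: str = "",
                 nls_at_c_terminus: bool = True) -> str:
    """Assemble a fusion sequence from recombinase, linker and NLS.

    An empty linker models a direct link. The default places the NLS at
    the C-terminus, the arrangement described as preferred. For an
    N-terminal fusion a real construct would also need a start methionine
    ahead of the NLS, which this sketch does not add.
    """
    if len(linker) > MAX_LINKER_LENGTH:
        raise ValueError("linker longer than 30 residues")
    if not set(linker) <= NEUTRAL_LINKER_RESIDUES:
        raise ValueError("linker contains residues outside the assumed neutral set")
    if nls_at_c_terminus:
        return recombinase + linker + nls
    return nls + linker + recombinase
```

For a placeholder recombinase string `"MACDEFGH"` and a two-residue linker `"GG"`, the default call yields `"MACDEFGHGGPKKKRKV"`, i.e. the NLS follows the linker at the C-terminal end.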

The most preferred fusion protein of the present invention comprises the amino 
acid sequence shown in SEQ ID NO: 23 (a suitable DNA sequence coding for said 
fusion protein being shown in SEQ ID NO:22). 


Further preferred fusion proteins of the present invention are "NLS-XisA" and
"NLS-SSV" (having the NLS peptide fused to the N-terminus of the
recombinases) as shown in SEQ ID NOs:67 and 69, respectively (suitable DNA
sequences coding for said fusion proteins being shown in SEQ ID NOs:66 and 68,
respectively).

In embodiments (7), (8), (10) and (12) of the invention the DNA molecules, the
cell or transgenic organism may also contain recognition sequences for the
recombinase protein of the recombinase domain. Thus, when utilizing the fusion
protein of embodiment (2), the C31-Int recognition sequences attP and attB are
present in the DNA molecules, the cell or the transgenic organism.

The term "mammal" as used in embodiment (10) of the invention includes 
non-human mammals (viz. animals as defined above) and humans (if such subject 
matter is patentable with the respective patent authority). 

Since the modified recombinase of the invention, in particular the modified 
C31-Int, acts in mammalian cells as efficiently (or at least almost as efficiently) 
as the widely used Cre/loxP system, it can be used for a large variety of genomic 
modifications (including the methods disclosed in PCT/EP01/00060 and 



WO 02/38613 PCT/EP01/12975 

PCT/EP00/10162, the content of which is herewith incorporated by reference). 
Concerning embodiment (11) it is to be noted that the mammals of embodiment 
(10) can be used to study the function of genes, e.g. in mice, by conditional gene 
targeting. For this purpose suitable recognition sequences - when utilizing the 
fusion protein of embodiment (2), one attP and one attB site (C31-Int recognition 
sequences) in the same orientation - can be introduced into introns of a gene by 
homologous recombination of a gene targeting vector in ES cells such that the 
two sites flank one or more exons of the gene to be studied but do not interfere 
with gene expression. A selection marker gene, needed to isolate recombinant ES 
cell clones, can be flanked by two recognition sites of another recombinase, such 
as loxP or FRT sites, to enable deletion of the marker gene upon transient 
expression of the respective recombinase in ES cells. These ES cells can be used 
to generate germline chimaeric mice which transmit the target gene modified by 
att sites to their offspring and allow a modified mouse strain to be established. 
The crossing of this strain with a C31-Int recombinase transgenic line, or the 
application of C31-Int protein, will result in the deletion of the att-flanked gene 
segment from the genome of doubly transgenic offspring and in the inactivation 
of the target gene in a prespecified temporally and/or spatially restricted 
manner. The C31-Int transgenic strain contains a transgene 
whose expression is either constitutively active in certain cells and tissues or is 
inducible by external agents, depending on the promoter region used. If an attB 
and an attP site are placed into the genome in opposite orientation, C31-Int 
mediated recombination results in the irreversible inversion of the flanked gene 
segment, leading to the functional loss of one or more exons of the target gene. 
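
The dependence of the outcome on site orientation described above can be illustrated with a toy model in which a gene is a list of labelled segments. This is a sketch under simplifying assumptions: the function name `recombine` and the label `att-hybrid` (standing for the hybrid site left after attB x attP recombination) are hypothetical, and both sites are assumed to lie on the same DNA molecule.

```python
def recombine(segments, same_orientation):
    """Toy model of C31-Int recombination between one attB and one attP site.

    segments: list of labels containing exactly one "attB" and one "attP".
    same_orientation: True gives deletion of the flanked segment,
                      False gives its inversion.
    """
    i, j = sorted((segments.index("attB"), segments.index("attP")))
    flanked = segments[i + 1:j]
    if same_orientation:
        # Deletion: the flanked segment is excised, one hybrid site remains.
        return segments[:i] + ["att-hybrid"] + segments[j + 1:]
    # Inversion: the flanked segment is reversed between two hybrid sites.
    return segments[:i] + ["att-hybrid"] + flanked[::-1] + ["att-hybrid"] + segments[j + 1:]


gene = ["exon1", "attB", "exon2", "exon3", "attP", "exon4"]
deleted = recombine(gene, same_orientation=True)
inverted = recombine(gene, same_orientation=False)
print(deleted)
print(inverted)
```

In both cases the original attB and attP sites are consumed, which is consistent with the recombination being described as irreversible.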

Thus, the method allows the analysis of gene function in particular cell types and 
tissues of otherwise widely expressed genes and circumvents embryonic lethality, 
which is often the consequence of complete (germline) gene inactivation. For the 
validation of genes and their products for drug development, gene inactivation 
which is inducible in adults provides an excellent genetic tool as this mimics the 
biological effects of target inhibition upon drug application. If a pair of attB/P 
sites is placed in the same or opposite orientation into a chromosome at large 
distance using two gene targeting vectors, C31-Int recombination allows the 
deletion or inversion of chromosome segments containing one or more genes, or 
chromosomal translocations if the two sites are located on different 
chromosomes. In another application of the method a pair of attB/P sites is 
placed in the same orientation within a transgene such that the deletion of the 
att-flanked DNA segment results in gene expression, e.g. of a toxin or reporter 
gene for cell lineage studies, or in the inactivation of the transgene. 

In addition, in accordance with embodiment (12) of the invention, the recombination 
system of embodiment (1), in particular the C31-Int recombination system of 
embodiment (2), can also be used for the site-specific integration of foreign DNA 
into the genome of mammalian cells, e.g. for gene therapy. For this purpose, and 
if the C31-Int recombination system of embodiment (2) is utilized, only one attB 
(or attP) site is initially introduced into the genome by homologous 
recombination, or an endogenous genomic sequence which resembles attB or 
attP is used. The application of a vector containing an attP (or attB) site to such 
cells or mice in conjunction with the expression of C31-Int recombinase will lead 
to the site-specific integration of the vector into the genomic att site. 

Thus, the present invention provides a process which enables the highly efficient 
modification of the genome of mammalian cells by site-specific recombination. 
Said process possesses the following advantages over current technology: 

(i) the modified recombinase, in particular the modified C31-Integrase, allows 
the recombination of extrachromosomal and genomic DNA in mammalian cells at 
much higher efficiency as compared to the use of its wildtype form; 

(ii) the modified recombinase, in particular the modified C31-Integrase, is the 
first described alternative recombination system with equal efficiency to 
Cre/loxP for the recombination of chromosomal DNA in mammalian cells. 

The appended figures further explain the present invention: 

Figure 1 shows C31-Int and Cre recombinase expression vectors and a 
recombinase reporter vector used for transient and stable transfections. 
A-D: Mammalian expression vectors for recombinases which contain the CMV 
immediate early promoter followed by a hybrid intron, the coding region of the 
recombinase to be tested, and an artificial polyadenylation signal sequence (pA). 
A: pCMV-C31Int(wt) containing the nonmodified (wildtype) 1.85 kb coding region 
of C31-Int as found in the genome of phage ΦC31. 

B: pCMV-C31Int(NNLS) containing a modified C31-Int gene coding for the full 
length C31-Int protein with an N-terminal fusion to the SV40 virus large T antigen 
nuclear localisation signal (NLS). 

C: pCMV-C31Int(CNLS) containing a modified C31-Int gene coding for the full 
length C31-Int protein with a C-terminal fusion to the SV40 virus large T antigen 
nuclear localisation signal (NLS). 

D: pCMV-Cre contains the 1.1 kb Cre coding region with an N-terminal fusion to 
the SV40 T antigen NLS. 

E: Recombination substrate vector pRK64 contains a SV40 promoter region 
followed by a 1.1 kb cassette consisting of the coding region of the puromycin 
resistance gene and a polyadenylation signal sequence, flanked 5' by the 84 bp 
attB and 3' by the 84 bp attP recognition site of C31-Int. pRK64 contains in 
addition two Cre recognition (loxP) sites in direct orientation next to the att sites. 

Figure 2 shows results of transient transfections of C31-Int and Cre recombinase 
and reporter vectors into CHO cells. 

All transfections were performed with a fixed amount of the reporter plasmid 
pRK64 and 0.5 ng or 1 ng of the recombinase expression plasmids 
pCMV-C31-Int(wt) (samples 4-5), pCMV-C31-Int(NNLS) (samples 6-7), 
pCMV-C31-Int(CNLS) (samples 8-9) or pCMV-Cre (samples 10-11). Negative controls: 
transfection with pRK64 (sample 3) or pUC19 alone (sample 1). Positive control: 
transfection with the Cre-recombined reporter pRK64(ΔCre) (sample 2). 
The vertical rows show the mean values and standard deviation of "Relative Light 
Units" obtained from lysates with the assay for β-galactosidase (RLU (β-Gal)), 
the RLU from the assay for Luciferase, the ratio of the β-galactosidase and 
Luciferase values with standard deviation (RLU x 10 s (Gal/Luc)), and the relative 
activity of the various recombinases as compared to the positive control defined 
as 1. 
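
The quantities listed for Figure 2 (mean and standard deviation, the β-galactosidase to luciferase ratio, and activity relative to a positive control set to 1) can be sketched as follows. This is an illustration only: the text does not state whether ratios were formed per well or from the means, so the per-well approach below is an assumption, the function names are hypothetical, and all readings are made-up numbers.

```python
from statistics import mean, stdev


def normalised_ratio(bgal_rlu, luc_rlu):
    """Per-well beta-galactosidase / luciferase ratios as (mean, sample SD)."""
    ratios = [b / l for b, l in zip(bgal_rlu, luc_rlu)]
    return mean(ratios), stdev(ratios)


def relative_activity(sample_ratio, positive_control_ratio):
    """Activity relative to the positive control, which is defined as 1."""
    return sample_ratio / positive_control_ratio


# Made-up RLU readings from four wells, for illustration only.
sample_mean, sample_sd = normalised_ratio([200.0, 220.0, 180.0, 200.0], [100.0] * 4)
control_mean, _ = normalised_ratio([400.0] * 4, [100.0] * 4)
rel = relative_activity(sample_mean, control_mean)
print(round(sample_mean, 3), round(sample_sd, 3), round(rel, 3))
```

Dividing by the luciferase signal corrects for well-to-well differences in transfection efficiency, which is why a luciferase vector is co-transfected in every sample.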

Figure 3 shows results of transient transfections of XisA and Ssv recombinases 
and reporter vectors into CHO cells. 

All transfections were performed with fixed amounts of the reporter plasmids 
pPGKnif (for XisA) and pPGKattA (for SSV) and 25 ng or 100 ng of the 
recombinase expression plasmids pCMV-XisA, pCMV-XisA(NNLS) and 10 ng or 20 
ng of the expression plasmids pCMV-Ssv and pCMV-Ssv(NNLS). Negative 
controls: transfection with pPGKnif or pPGKattA alone. 

The vertical rows show the mean values and standard deviation of "Relative Light 
Units" obtained from lysates with the assay for β-galactosidase (RLU (β-Gal)), 
the RLU from the assay for Luciferase, the ratio of the β-galactosidase and 
Luciferase values with standard deviation (RLU x 10 s (Gal/Luc)). 

Figure 4 shows results of transient transfections of recombinase vectors into a 
stable reporter cell line. 

All transfections were performed with a NIH 3T3 derived clone containing stably 
integrated copies of the pRK64 recombination substrate vector. Either 32 ng or 
64 ng of the recombinase expression plasmids pCMV-C31-Int(wt) (samples 2-3), 
pCMV-C31-Int(NNLS) (samples 4-5), pCMV-C31-Int(CNLS) (samples 6-7) or 
pCMV-Cre(NNLS) (samples 8-9) were used. Negative control: transfection with 
pUC19 alone (sample 1). 

The vertical rows show the mean values and standard deviation of "Relative Light 
Units" obtained from lysates with the assay for β-galactosidase (RLU (β-Gal)) and 
the relative activity of the various recombinases as compared to the value 
obtained with pCMV-Cre(NNLS) defined as 1. 

Figure 5 shows the in situ detection of β-galactosidase in 3T3(pRK64)-3 cells 
transfected with recombinase expression vectors. 

The Cre and C31-Int recombinase reporter cell line 3T3(pRK64)-3 was either not 
transfected with DNA (A), transfected with the Cre expression vector pCMV-Cre 
(B) or with the C31-Int expression vector pCMV-C31-Int(CNLS). Two days after 
transfection the cells were fixed and incubated with the histochemical X-Gal assay 
which develops a blue stain in β-galactosidase positive cells indicating 
recombinase mediated activation of the reporter gene. 

Figure 6 shows the test vector for C31-Int mediated deletion, pRK64, and the 
expected product of deletion, pRK64(ΔInt). 

Plasmid pRK64 contains the 1.1 kb cassette of the coding region of the 
puromycin resistance gene and a polyadenylation signal, which is flanked 5' by 
the 84 bp attB and 3' by the 84 bp attP recognition site (large triangles) of 
C31-Int. These attB and attP sites are oriented in the same way to each other (thick 
black arrows) as is used by the ΦC31 phage to integrate into the bacterial 
genome. In addition, the cassette is flanked by two Cre recombinase recognition 
(loxP) sites in the same orientation (black small triangles). For better orientation 
the half sites of the att sequences are labelled by a direction (thin arrow) and 
numbered 1-4. The 3 bp sequence within the att sites at which recombination 
occurs is framed by a box. The positions at which the PCR primers P64-1 and 
P64-4 hybridise to the pRK64 vector are indicated by arrows, pointing into the 
3' direction of both oligonucleotides. 
pRK64(ΔInt) depicts the deletion product expected from the C31-Int mediated 
recombination between the att sites of pRK64. The recombination between a pair 
of attB/attP sites generates an attR site remaining on the parental DNA molecule 
while the puromycin cassette is excised. In this configuration the primers P64-1 
and P64-4 will amplify a PCR product of 630 bp from pRK64(ΔInt). 
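
The size change caused by the deletion can be estimated from the att site coordinates given for pRK64 in Example 1 (attB at positions 348 - 431, attP at positions 1534 - 1617). The sketch below assumes that the crossover occurs at the same offset within both 84 bp sites, so that one site-length hybrid remains; the function names are hypothetical, and the size of the product from the undeleted vector is an inference from these assumptions, not a value stated in the text.

```python
ATTB = (348, 431)    # attB position in pRK64, from the plasmid description
ATTP = (1534, 1617)  # attP position in pRK64, from the plasmid description
PCR_PRODUCT_AFTER_DELETION_BP = 630  # stated for pRK64(deltaInt)


def site_length(site):
    """Length in bp of a site given as inclusive (start, end) coordinates."""
    start, end = site
    return end - start + 1


def deleted_length(attb, attp):
    """bp removed when the crossover is at the same offset in both sites."""
    return attp[0] - attb[0]


removed = deleted_length(ATTB, ATTP)
# Inferred amplicon size from the undeleted vector under the same assumption.
product_before_deletion = PCR_PRODUCT_AFTER_DELETION_BP + removed
print(site_length(ATTB), site_length(ATTP), removed, product_before_deletion)
```

Under these assumptions the deleted and undeleted templates give clearly different amplicon sizes, which is what makes the primer pair usable as a deletion assay.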

Figure 7 shows PCR products generated with the primers P64-1 and P64-4 and a 
sequence comparison of the PCR product. 

A: Analysis of PCR products on an agarose gel from PCR reactions using the 
primers P64-1 and P64-4 on DNA extracted from MEF5-5 cells transfected 2 days 
before with plasmid pRK64 alone (lane 4), with pRK64 + pCMV-Cre (lane 3), with 
pRK64 + pCMV-C31-Int(wt) (lane 2), and from a control reaction which did not 
contain cellular DNA (lane 1). The product with an apparent size around 650 bp, 
as compared to the size marker used, from lane 2 was excised from the agarose 
gel and purified. The PCR product was cloned into a sequencing plasmid vector 
and gave rise to the plasmid pRK80d. The insert of this plasmid was sequenced 
using reverse primer (seq80d) and compared to the predicted sequence of the 
pRK64 vector after C31-Int mediated deletion of the att flanked cassette, 
pRK64(ΔInt). The cloned PCR product shows a 100% identity with the predicted 
attR sequence after deletion. The generated attR site is shown in a box, with the 
same sequence designation used in Figure 5. The nucleotide positions (pos.) of 
the compared sequences pRK64(ΔInt) and Seq80d are indicated. 

Figure 8 shows the modified ROSA26 locus of C31 reporter mice (SEQ ID 
NO:106). A recombination substrate has been inserted in the ROSA26 locus. The 
substrate consists of a splice acceptor (SA) followed by a cassette consisting of 
the hygromycin resistance gene driven by a PGK promoter and flanked by the 
recombination sites attB and attP. In addition the reporter contains two Cre 
recognition sites (loxP) in direct orientation next to the att sites. This cassette is 
followed by the coding region for β-galactosidase, which is only expressed when 
the hygromycin resistance gene has been deleted by recombination. 

Figure 9 shows the in situ detection of β-galactosidase activity. A cryosection of 
the testis of a double transgenic mouse carrying both the C31-Int recombinase 
and the recombination substrate was stained with X-Gal (A). The blue colour 
indicates recombination of the substrate, which leads to the expression of 
β-galactosidase. As a control a cryosection of testis of a transgenic mouse carrying 
only the recombination substrate was stained with X-Gal (B). 

The present invention is further illustrated by the following Examples which are, 
however, not to be construed as to limit the invention. 

Examples 

Example 1 

As compared to Cre recombinase the wildtype form of C31-Int exhibits a 
significantly lower recombination activity in mammalian cells which falls in the 
range of 10 - 40% of Cre, depending on the assay system used (see below). As 
a measure which may increase C31-Int efficiency in eukaryotic cells we designed 
mammalian expression vectors for N- or C-terminal fusion proteins of C31-Int 
with a peptide which is recognised by the nuclear import 
machinery. The recombination efficiency obtained by this modified C31-Int 
recombinase in mammalian cells was compared side by side to the unmodified 
(wildtype) form of C31-Int and to Cre recombinase. For the quantification of 
recombinase activities the expression vectors were transiently introduced into a 
mammalian cell line together with a reporter vector which contains C31-Int and 
Cre target sites and leads to the expression of β-galactosidase upon recombinase 
mediated deletion of a vector segment flanked by recombinase recognition sites. 

A. Plasmid constructions: 
Construction of the recombination test vectors pPGKnif and pPGKattA: first a nifD 
site (Haselkorn, Annu. Rev. Genet. 26, 113-130 (1992)), generated by the 
annealing of the two synthetic oligonucleotides nifD3 (SEQ ID NO:89) and nifD4 
(SEQ ID NO:90), was ligated into the BamHI restriction site of the vector 
pSV-Pax1 (Buchholz et al., Nucleic Acids Res., 24, 4256-4262 (1996)), 3' of its 
puromycin resistance gene and loxP site, giving rise to plasmid pPGKnifD3' (SEQ 
ID NO:79). Next, another nifD site, generated by the annealing of the two 
synthetic oligonucleotides nifD1 (SEQ ID NO:87) and nifD2 (SEQ ID NO:88), was 
ligated into the BstBI restriction site of plasmid pPGKnifD3', upstream of the 
puromycin resistance gene and loxP site, giving rise to plasmid pPGKnifD (SEQ ID 
NO:78). For pPGKattA (Muskhelishvili et al., Mol. Gen. Genet. 237, 334-342 
(1993)) first a 352 bp fragment was amplified from genomic DNA from the 
thermophilic bacterium Sulfolobus shibatae (DSM-5389, DSMZ Braunschweig - 
Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, Mascheroder 
Weg 1b, D-38124 Braunschweig, Germany) with oligonucleotides SSV5 (SEQ ID 
NO:96) and SSV6 (SEQ ID NO:97) including restriction sites for BamHI and 
BstBI. The amplified fragment was cloned into the BamHI site of the vector 
pSV-Pax1 giving rise to plasmid pPGKattA1 (SEQ ID NO:82); subsequently the same 
352 bp fragment was cloned into the BstBI site of pPGKattA1 giving rise to the 
plasmid pPGKattA2 (SEQ ID NO:83). The sequence and orientation of both nifD 
sites and attA sites were confirmed by DNA sequence analysis. In 
pPGKnifD/pPGKattA2 the newly cloned nifD/attA sites (positions 535-619 and 
1722-1787/ positions 6718-7081 and 12-363) are in the same orientation 
flanking the puromycin resistance gene and the SV40 early polyadenylation 
sequence. The nifD/attA sites are followed by loxP sites in the same orientation 
(positions 623 - 656 and 1794 - 1827/ positions 7085-7118 and 369-402). The 
puromycin cassette is transcribed from the SV40 early enhancer/promoter region 
and followed by the coding region for E. coli β-galactosidase and the SV40 late 
region polyadenylation sequence. 

Construction of XisA and SSV expression vectors: First the XisA gene of 
cyanobacterium PCC7120 was amplified by PCR from genomic DNA from Nostoc 
strain PCC7120 (CNCM - Collection Nationale de Cultures de Microorganismes, 
Institut Pasteur, Paris) using the primers XisA1 (SEQ ID NO:84) and XisA3 (SEQ 
ID NO:86), and XisA1 (SEQ ID NO:84) and XisA2 (SEQ ID NO:85) (with NLS). 
The ends of the PCR product were digested with NotI and the product was ligated 
into plasmid pBluescript II KS, opened with NotI, giving rise to plasmids pRK42a 
and pRK43 (with NNLS). The DNA sequence of the insert was determined and 
found to be identical to the published XisA sequence (Genbank GI: 3953452) 
apart from four silent point mutations. The XisA gene was isolated as a 1.4 kb 
fragment from pRK42a and pRK43 by digestion with NotI and ligated into the 
generic mammalian expression vector pRK50 (see below), opened with NotI, 
giving rise to the XisA expression vectors pCMV-XisA (SEQ ID NO:76) and 
pCMV-XisA(NNLS) (SEQ ID NO:77). pCMV-XisA(wt) contains a cytomegalovirus 
immediate early gene promoter (position 1 - 616), a 240 bp hybrid intron 
(position 716 - 953), the XisA gene (position 974 - 2392), and a synthetic 
polyadenylation sequence (position 2413 - 2591). 

The SSV gene was amplified from genomic DNA from the thermophilic bacterium 
Sulfolobus shibatae (DSM-5389, DSMZ Braunschweig - Deutsche Sammlung von 
Mikroorganismen und Zellkulturen GmbH, Mascheroder Weg 1b, D-38124 
Braunschweig, Germany) in two PCR steps because of an internal attP sequence. 
First, two overlapping PCR fragments were created with the oligonucleotides 
SSV1-1 (SEQ ID NO:91) (or SSV1-2 for the SSV(NNLS) gene) and SSV2 (SEQ ID 
NO:93) and oligonucleotides SSV3 (SEQ ID NO:94) and SSV4 (SEQ ID NO:95). 
Using these overlapping fragments as template, a 1000 bp fragment containing 
the complete SSV coding sequence was amplified with primers SSV1-1 (or 
SSV1-2 for the SSV(NNLS) gene) and SSV4. The 5' 620 bp fragments of these PCR 
products were isolated by digestion with NotI-XhoI and cloned into vector 
pBluescript II KS giving rise to plasmids pRK47 and pRK48 (with NLS). The 3' 
380 bp fragment generated by XhoI digestion was cloned into the XhoI 
restriction site of vector pBluescript II KS giving rise to the plasmid pBS-SSVs 
(SEQ ID NO:72). The 380 bp SSV fragment was then isolated by digestion of 
pBS-SSVs with XhoI and ligated into pRK47 and pRK48 opened by XhoI giving 
rise to plasmids pBS-SSV3 (SEQ ID NO:70) and pBS-SSV4 (SEQ ID NO:71) (with 
NLS) containing the complete SSV gene. Sequencing of the plasmids confirmed 
one point mutation in both plasmids. Therefore 312 bp/ 91 bp fragments 
generated by digestion with EcoRV-SmaI/ EcoRV-XhoI of another clone of pRK47 
were exchanged in plasmids pBS-SSV3/ pBS-SSV4. Sequences were confirmed 
by sequencing. The SSV gene was isolated from pRK47 and pRK48 by digestion 
with NotI and KpnI and ligated into the generic mammalian expression vector 
pRK50 (see below), opened with NotI and SalI, giving rise to the SSV expression 
vectors pCMV-SSV(wt) (SEQ ID NO:74) and pCMV-SSV(NNLS) (SEQ ID NO:75). 

Construction of the recombination test vector pRK64: first an attB site (Thorpe et 
al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)), generated by the 
annealing of the two synthetic oligonucleotides C31-4 (SEQ ID NO:1) and C31-5 
(SEQ ID NO:2), was ligated into the BstBI restriction site of the vector pSV-Pax1 
(Buchholz et al., Nucleic Acids Res., 24, 4256-4262 (1996)), 5' of its puromycin 
resistance gene and loxP site, giving rise to plasmid pRK52. The sequence and 
orientation of the cloned attB site was confirmed by DNA sequence analysis. 
Next, an attP site (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 
(1998)), generated by the annealing of the two synthetic oligonucleotides C31-6 
(SEQ ID NO:3) and C31-7-2 (SEQ ID NO:4), was ligated into the BamHI 
restriction site of plasmid pRK52, downstream of the puromycin resistance gene 
and loxP site, giving rise to plasmid pRK64 (SEQ ID NO:5). The sequence and 
orientation of the attP site was confirmed by DNA sequence analysis. In pRK64 
the newly cloned attB (position 348 - 431) and attP (position 1534 - 1617) sites 
are in the same orientation flanking the puromycin resistance gene and the SV40 
early polyadenylation sequence. The attB and attP sites are followed by loxP sites 
in the same orientation (positions 435 - 469 and 1624 - 1658). The puromycin 
cassette is transcribed from the SV40 early enhancer/promoter region and 
followed by the coding region for E. coli β-galactosidase and the SV40 late region 
polyadenylation sequence. 

Construction of C31-Int expression vectors: First the C31-Int gene of phage 
ΦC31 was amplified by PCR from phage DNA (DSM-49156, DSMZ - Deutsche 
Sammlung von Mikroorganismen und Zellkulturen GmbH, Mascheroder Weg 1b, 
D-38124 Braunschweig, Germany) using the primers C31-1 (SEQ ID NO:6) and 
C31-3 (SEQ ID NO:7). The ends of the PCR product were digested with NotI and 
the product was ligated into plasmid pBluescript II KS, opened with NotI, giving 
rise to plasmid pRK40. The DNA sequence of the 1.85 kb insert was determined 
and found to be identical to the published C31-Int gene (Kuhstoss et al., J. Mol. 
Biol., 222, 897-908 (1991)), except for an error in the stop codon. This error was 
repaired by PCR amplification of a 300 bp fragment from plasmid pRK40 using 
the primers C31-8 (SEQ ID NO:8) and C31-9 (SEQ ID NO:9), which provide a 
corrected stop codon. The ends of this PCR fragment were digested with 
Eco47III and XhoI, and the fragment was ligated into plasmid pRK40, opened 
with Eco47III and XhoI to remove the fragment containing the defective stop 
codon. The resulting plasmid pRK55 contains the correct C31-Int gene as 
confirmed by DNA sequence analysis. 

The C31-Int gene was isolated from pRK55 as a 1.85 kb fragment by digestion with 
NotI and XhoI and ligated into the generic mammalian expression vector pRK50 
(see below), opened with NotI and XhoI, giving rise to the C31-Int expression 
vector pCMV-C31-Int(wt). pCMV-C31-Int(wt) (SEQ ID NO:10) contains a 700 bp 
cytomegalovirus immediate early gene promoter (position 1 - 700), a 270 bp 
hybrid intron (position 701 - 970), the C31-Int gene (position 978 - 2819), and 
a 189 bp synthetic polyadenylation sequence (position 2831 - 3020). 
For the construction of pCMV-C31-Int(NNLS) a 1.5 kb fragment was amplified by 
PCR from phage DNA using oligonucleotides C31-2 (SEQ ID NO:98) and C31-3 
(SEQ ID NO:7). The ends of the PCR product were digested with NotI and the 
product was ligated into plasmid pBluescript II KS, opened with NotI, giving rise 
to plasmid pRK41 (SEQ ID NO:99). A 1100 bp fragment generated by digestion 
of plasmid pRK41 with NcoI and NotI was then ligated into plasmid pRK55 (SEQ 
ID NO:80), opened with NcoI and NotI, giving rise to the plasmid pRK63 (SEQ ID 
NO:81). The C31-Int gene with N-terminal NLS was isolated as a 1.8 kb fragment 
from pRK63 by digestion with NotI and XhoI and ligated into the mammalian 
expression vector pRK50, opened with NotI and XhoI, giving rise to the C31-Int 
expression vector pCMV-C31-Int(NNLS). pCMV-C31-Int(NNLS) (SEQ ID NO:73) 
contains a 700 bp cytomegalovirus immediate early gene promoter (position 1 
- 700), a 270 bp hybrid intron (position 701 - 970), the C31-Int gene with 
N-terminal NLS (position 976 - 2838), and a 189 bp synthetic polyadenylation 
sequence (position 2851 - 3040). 

For the construction of pCMV-C31-Int(CNLS), the 3'-end of the C31-Int gene 
was amplified from pCMV-C31-Int(wt) as a 300 bp PCR fragment using the 
primers C31-8 (SEQ ID NO:8) and C31-2-2 (SEQ ID NO:11). Primer C31-2-2 
modifies the 3'-end of the wildtype C31-Int gene such that the stop codon is 
replaced by a sequence of 21 basepairs coding for the SV40 T-antigen nuclear 
localisation sequence of 7 amino acids 
(Proline-Lysine-Lysine-Lysine-Arginine-Lysine-Valine) (Kalderon et al., Cell, 39, 
499 - 509 (1984)), followed by a new stop 
codon. The ends of this 300 bp PCR fragment were digested with Eco47III 
and XhoI, and the fragment was ligated into plasmid pCMV-C31-Int(wt), opened 
with Eco47III and XhoI, to replace the 3'-end of the wildtype C31-Int gene, 
resulting in the plasmid pCMV-C31-Int(CNLS). The identity of the new gene 
segment was verified by DNA sequence analysis. pCMV-C31-Int(CNLS) (SEQ ID 
NO:12) contains a 700 bp cytomegalovirus immediate early gene promoter 
(position 12 - 711), a 270 bp hybrid intron (position 712 - 981), the modified 
C31-Int gene (position 989 - 2851), and a 189 bp synthetic polyadenylation 
sequence (position 2854 - 3043). 
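
The coordinates quoted for the three C31-Int expression vectors allow a simple consistency check on the size of the NLS insertion. The sketch below assumes that the quoted positions are inclusive and that each gene span includes its stop codon; the function name `span_length` is hypothetical.

```python
def span_length(start, end):
    """Length in bp of a feature given by inclusive start and end positions."""
    return end - start + 1


# Gene coordinates as quoted for the three expression vectors.
WT_GENE = (978, 2819)    # pCMV-C31-Int(wt)
NNLS_GENE = (976, 2838)  # pCMV-C31-Int(NNLS)
CNLS_GENE = (989, 2851)  # pCMV-C31-Int(CNLS)
NLS_CODING_BP = 21       # 7 amino acids x 3 bp, as stated for primer C31-2-2

wt_len = span_length(*WT_GENE)
nnls_len = span_length(*NNLS_GENE)
cnls_len = span_length(*CNLS_GENE)
print(wt_len, nnls_len, cnls_len)
```

Under these assumptions both NLS variants are exactly 21 bp longer than the wildtype gene, and all three lengths are multiples of three, as expected for in-frame fusions.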

To generate the Cre expression plasmid pCMV-Cre (SEQ ID NO:13), the coding 
sequence of Cre recombinase (Sternberg et al., J. Mol. Biol., 187, 197 - 212 
(1986)) with an N-terminal fusion of the 7 amino acid SV40 T-antigen NLS (see 
above) was recovered from plasmid pgk-Cre and cloned into the NotI and XhoI 
sites of plasmid pRK50. pRK50 (SEQ ID NO:14) is a generic expression vector for 
mammalian cells based on the cloning vector pNEB193 (New England Biolabs Inc, 
Beverly, MA, USA). pRK50 was built by insertion into pNEB193 of a 700 bp 
cytomegalovirus immediate early gene (CMV-IE) promoter (position 1-700) 
from plasmid pIREShyg (GenBank#U89672; Clontech Laboratories Inc, Palo Alto, 
CA, USA), a synthetic 270 bp hybrid intron (position 701-970), consisting of an 
adenovirus derived splice donor and an IgG derived splice acceptor sequence 
(Choi et al., Mol. Cell. Biol., 11, 3070 - 3074 (1991)), and a 189 bp synthetic 
polyadenylation sequence (position 1000-1188) built from the polyadenylation 
consensus sequence and 4 MAZ polymerase pause sites (Levitt et al., 
Genes&Dev., 3, 1019 - 1025 (1989); The EMBO J. 13, 5656 - 5667 (1994)). The 
positive control plasmid pRK64(ΔCre) (SEQ ID NO:15) was generated from 
pRK64 by transformation into the Cre expressing E. coli strain 294-Cre (Buchholz 
et al., Nucleic Acids Res., 24, 3118 - 3119 (1996)). 

One of the transformed subclones was confirmed for the Cre mediated deletion of 
the loxP-flanked cassette by restriction mapping and further expanded. Plasmid 
pUC19 is a cloning vector without eukaryotic control elements used to equalise 
DNA amounts for transfections (GenBank#X02514; New England Biolabs Inc, 
Beverly, MA, USA). All plasmids were propagated in DH5α E. coli cells (Life 
Technologies GmbH, Karlsruhe, Germany) grown in Luria-Bertani medium and 
purified with the plasmid DNA purification reagents "Plasmid-Maxi-Kit" (Qiagen 
GmbH, Hilden, Germany) or "Concert high purity plasmid purification system" 
(Life Technologies GmbH, Karlsruhe, Germany). Following purification, the 
plasmid DNA concentrations were determined by absorption at 260 nm and 280 
nm in UVette cuvettes (Eppendorf-Netheler-Hinz GmbH, Hamburg, Germany) 
using a BioPhotometer (Eppendorf-Netheler-Hinz GmbH, Hamburg, Germany) 
and the plasmids were diluted to the same concentration; finally these were 
confirmed by separation of 10 ng of each plasmid on an ethidium bromide-stained 
agarose gel. 

B. Cell cul ture and transfections: Chinese hamster ovary (CHO) cells (Puck et al., 
J. Exp. Med., 108, 945 (1958)) were obtained from the Institute for Genetics 
(University of Cologne, Germany) as a population adapted to growth In DMEM 
medium. The cells were grown in DMEM/GIutamax medium (Life Technologies) 

15 supplemented with 10% fetai calf serum at 37°C, 10% C0 2 in humid atmosphere 
and passaged upon trypsinisation. One day before transfection 10 6 cells were 
plated into a 48-well plate (Falcon). For the transient transfection of cells with 
plasmids each well received into 250 ml of medium a total amount of 300 ng 
supercoiled plasmid DNA corhplexed before with the FuGene6 transfection 

20 reagent (Roche Diagnostics GmbH, Mannheim, Germany) according to the 
manufacturers protocol. Each 300 ng DNA preparation (Fig.2 sample 4 to 11) 
contained 50 ng of the luciferase expression vector pUHC13-l (Gossen et al., 
Proc Natl Acad Sci USA., 89 5547-5551 (1992)), 50 ng of the substrate vector 
pRK64, 0.5 ng or 1 ng of one of the recombinase expression vectors pCMV- 

25 C31Int(wt), pCMV-C31Int(NNLS), pCMV-C31Int(CNLS) or pCMV-Cre and 199 ng 
or 199.5 ng of pUC19 plasmid, except for the controls which received 50 ng of 
PUHC13-1 together with 50 ng of pRK64 (sample 3) or pRK64(Acre) (sample 2) 
and 200 ng pUC19, or 50 ng pUHC13-l with 250 ng pUC19 (sample 1). 
Transfections of Ssv and XisA recombinases (Fig. 3) also contained 50 ng of the 

30 luciferase expression vector pUHC13-l, 50 ng of substrate vectors pPGKattA and 
pPGKnif and 10 ng or 20 ng of recombinase expression vector pCMV-SSV or 
pCMV-SSV(NNLS) or 25 ng or 100 ng of expression vectors pCMV-XisA/ pCMV- 
XisA(NNLS). Plasmid pUC19 was added to a total amount of 300 ng plasmid DNA. 
As the C31-Int expression vectors are 15% larger in size than pCMV-Cre and the 

35 same amounts of DNA of the three plasmids were used for transfection, the 
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samples with C31-Int vectors received 15% less plasmid molecules as compared 
to the samples with Cre expression vector. The 8-galactosidase values from C31- 
Int transfected samples by this value were not corrected and thus is a slight 
underestimation of the calculated C31-Int activities. For each sample to be tested 
four individual wells were transfected. One day after the addition of the DNA
preparations each well received an additional 250 µl of growth medium. The cells of
each well were lysed 48 hours after transfection with 100 µl lysate reagent
supplemented with protease inhibitors (Roche Diagnostics). The lysates were
centrifuged and 20 µl were used to determine the β-galactosidase activities
using the β-galactosidase reporter gene assay (Roche Diagnostics) according to
the manufacturer's protocol in a Lumat LB 9507 luminometer (Berthold). To
measure luciferase activity, 20 µl lysate was diluted into 250 µl assay buffer
(50 mM glycylglycine, 5 mM MgCl2, 5 mM ATP) and the "Relative Light Units" (RLU)
were counted in a Lumat LB 9507 luminometer after addition of 100 µl of a 1
mM luciferin (Roche Diagnostics) solution. The mean value and standard
deviation of each sample were calculated from the β-galactosidase and luciferase
RLU values obtained from the four transfected wells of each sample.
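
The effect of vector size on the number of plasmid molecules delivered at equal
DNA mass can be estimated with a short calculation. The following is a minimal
sketch for illustration only: the function name is hypothetical, and the only
input taken from the text is the 15% size difference. For a size ratio of 1.15
it gives about 13% fewer molecules, of the same order as the figure quoted above.

```python
def relative_copies(size_ratio):
    """Relative number of molecules in an equal mass of a plasmid
    that is size_ratio times larger than the reference plasmid."""
    return 1.0 / size_ratio

# The C31-Int vectors are stated to be 15% larger than pCMV-Cre.
size_ratio = 1.15
rel = relative_copies(size_ratio)
percent_fewer = (1.0 - rel) * 100.0

print(f"relative copies: {rel:.3f}")
print(f"fewer molecules: {percent_fewer:.1f}%")
```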

C. Results: To set up an assay system for the measurement of C31-Int and Cre
recombinase efficiency in mammalian cells the recombination substrate vector
pRK64 shown in Figure 1E was first constructed. pRK64 contains an SV40
promoter region for expression in mammalian cells followed by a 1.1 kb cassette
which consists of the coding region of the puromycin resistance gene and a
polyadenylation signal sequence. This cassette is flanked at the 5'-end by the 84
bp attB and at the 3'-end by the 84 bp attP recognition site of C31-Int (Fig. 1
and 6). These attB and attP sites are located on the same DNA molecule and
oriented relative to each other in a way which allows the deletion of the flanked
DNA segment. The same orientation of attB and attP sites is used naturally by the
ΦC31 phage and the bacterial genome, leading to the integration of the phage
genome when both sites are located on different DNA molecules (Thorpe et al.,
Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)). To measure C31-Int and
Cre recombinase activities with the same substrate vector, pRK64 contains in
addition two Cre recognition (loxP) sites in direct orientation next to the att sites.
Since the att/lox-flanked cassette in plasmid pRK64 is inserted between the SV40
promoter and the coding region of the β-galactosidase gene, its presence inhibits
β-galactosidase expression as the SV40 promoter derived transcripts are
terminated at the polyadenylation signal of the puromycin gene. Plasmid pRK64
is turned into a β-galactosidase expression vector upon C31-Int or Cre mediated
deletion of the att/lox-flanked puromycin cassette since the remaining single att
and loxP sites do not substantially interfere with gene expression.

For the expression of recombinases a mammalian expression vector was
designed which contains the CMV immediate early promoter followed by a hybrid
intron, the coding region of the recombinase to be tested, and an artificial
polyadenylation signal sequence. The backbone sequence of the four
recombinase expression vectors shown in Figure 1A-D is identical
except for the recombinase coding region. Plasmid pCMV-C31Int(wt) (Fig. 1A)
contains the nonmodified (wildtype) 1.85 kb coding region of C31-Int as found in
the genome of phage ΦC31 (Kuhstoss et al., J. Mol. Biol. 222, 897-908 (1991)).
Plasmid pCMV-C31Int(NNLS) (Fig. 1B) contains a modified C31-Int gene coding
for the full length C31-Int protein with an N-terminal extension of 7 amino acids
derived from the SV40 virus large T antigen which serves as a nuclear
localisation signal (NLS). Plasmid pCMV-C31Int(CNLS) (Fig. 1C) contains a
C-terminal extension of 7 amino acids derived from the SV40 virus large T antigen
which serves as a nuclear localisation signal (NLS). Plasmid pCMV-Cre (Fig. 1D)
contains the 1.1 kb Cre coding region with an N-terminal fusion of the 7 amino
acid NLS of the SV40 T-antigen. For Cre recombinase it has been shown that the
N-terminal addition of the SV40 T-antigen NLS does not increase its
recombination efficiency in mammalian cells (Le et al., Nucleic Acids Res., 27,
4703 - 4709 (1999)).

As a test system to compare the efficiency of the 4 recombinases the same
amount of plasmid DNA of each of the recombinase expression vectors together
with a fixed amount of the reporter plasmid pRK64 was transiently introduced
into Chinese Hamster Ovary (CHO) cells. Thus, this assay design compares the
efficiency of the various recombinases on an extrachromosomal substrate
introduced into the CHO cells as a circular plasmid. Two days after transfection
the cells from the various samples were lysed and the activity of β-galactosidase
in the lysates was determined by a specific chemiluminescence assay and
expressed in "Relative Light Units" (RLU (β-Gal)) (Fig. 2). In addition all samples
contained a fixed amount of a luciferase expression vector to control for the
experimental variation of cell transfection and lysis. For this purpose the lysates
of each sample were also tested for luciferase activity with a specific
chemiluminescence assay and the values expressed as "Relative Light Units"
(RLU (Luciferase)) (Fig. 2). All transfection samples contained in addition varying
amounts of the unrelated cloning plasmid pUC19 so that all samples were
equalised to the same amount of plasmid DNA. As a positive control for
β-galactosidase a derivative of the recombination reporter pRK64 was used in
which the loxP flanked 1.1 kb cassette has been removed through Cre mediated
recombination in E. coli giving rise to plasmid pRK64(ΔCre). Negative controls
were samples which received the unrecombined reporter plasmid pRK64 but no
recombinase expression vector as well as samples set up with the pUC19 plasmid
alone.

To determine the relative efficiency of the tested recombinases the RLU values of
β-galactosidase were divided individually for each sample by the RLU values
obtained for luciferase and multiplied by 10^5. From the values of the four data
points of each sample the mean value and standard deviation were calculated as
an indicator of recombinase activity (Gal/Luc) (Fig. 2). The relative activity of the
tested recombinases was then compared to the positive control defined as an
activity of 1.
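
The normalisation described above can be sketched in a few lines. This is an
illustration under stated assumptions, not the procedure actually used: the
function names are hypothetical, the RLU readings are invented placeholder
values (not data from Fig. 2), and the use of the sample standard deviation is
an assumption, since the text does not specify which estimator was applied.

```python
from statistics import mean, stdev

def gal_luc(gal_rlu, luc_rlu, scale=1e5):
    """Per-well beta-galactosidase RLU divided by luciferase RLU, times 10^5."""
    return [scale * g / l for g, l in zip(gal_rlu, luc_rlu)]

def summarise(gal_rlu, luc_rlu):
    """Mean and sample standard deviation of the Gal/Luc values of one sample."""
    ratios = gal_luc(gal_rlu, luc_rlu)
    return mean(ratios), stdev(ratios)

# Hypothetical readings from four wells each; placeholders only.
luc = [200000, 210000, 195000, 205000]
positive_mean, positive_sd = summarise([4000, 4200, 3900, 4100], luc)
sample_mean, sample_sd = summarise([2000, 2100, 1950, 2050], luc)

# Activity relative to the positive control, which is defined as 1.
relative_activity = sample_mean / positive_mean
print(relative_activity)
```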

As shown in Fig. 2, the expression of Cre recombinase (samples 10 and 11)
resulted in a 150 to 170-fold increase of β-galactosidase activity as compared to
the negative control (sample 3), demonstrating the wide dynamic range of our
test system. Each recombinase vector was tested using two different amounts of
DNA for transfection (0.5 and 1 ng/sample), which in the case of Cre resulted in
63% and 72% recombinase activity (samples 10 and 11 as compared to the
positive control). These two values establish that the DNA amounts used are
close to the test system's saturation for recombinase expression as the doubling
of DNA amounts resulted only in a minor increase of recombinase activity.

In comparison to Cre, the expression of wildtype C31-Int resulted in a
considerably lower recombinase activity of 23% and 30% (Fig. 2, samples 4 and
5) as compared to the positive control. These values represent 37% and 42%
recombinase activity for wildtype C31-Int as compared to Cre recombinase
(compare samples 4 and 5 with 10 and 11). Upon the expression of C31-Int
fused with the N-terminal NLS (C31-Int(NNLS)) values of 32% and 36%
recombinase activity (samples 6 and 7) were obtained as compared to the
positive control. The C31-Int(NNLS) values represent 51% and 50% recombinase
activity as compared to Cre (compare samples 6 and 7 to 10 and 11). Thus, the
activity of C31-Int in mammalian cells is just moderately enhanced by the
N-terminal addition of an NLS.

Surprisingly, upon the expression of C31-Int fused with the C-terminal NLS
(C31-Int(CNLS)) values of 50% and 65% recombinase activity (samples 8 and 9) were
obtained as compared to the positive control. The C31-Int(CNLS) values
represent 79% and 90% recombinase activity as compared to Cre recombinase
(compare samples 8 and 9 to 10 and 11). Unexpectedly, C31-Int(CNLS) exhibits
a dramatic, more than twofold increase of recombinase activity in comparison to
C31-Int(wt) (compare samples 8 and 9 to 4 and 5).
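
The percentages relative to Cre quoted for the Fig. 2 samples follow directly
from the activities relative to the positive control. The short sketch below
reproduces that arithmetic; the variable names are illustrative, and the input
values are those stated in the text for the 0.5 ng and 1 ng transfections.

```python
# Activity as % of the positive control (0.5 ng, 1 ng), as stated in the text.
activity = {
    "C31-Int(wt)": (23, 30),
    "C31-Int(NNLS)": (32, 36),
    "C31-Int(CNLS)": (50, 65),
    "Cre": (63, 72),
}
cre = activity["Cre"]

# Each value expressed as a rounded percentage of the matching Cre value.
relative_to_cre = {
    name: tuple(round(100 * v / c) for v, c in zip(values, cre))
    for name, values in activity.items()
    if name != "Cre"
}

# Fold increase of the C-terminal NLS fusion over the wildtype enzyme.
fold_cnls_over_wt = tuple(
    c / w for c, w in zip(activity["C31-Int(CNLS)"], activity["C31-Int(wt)"])
)

for name, values in relative_to_cre.items():
    print(name, values)
print([f"{f:.2f}" for f in fold_cnls_over_wt])
```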

In order to test whether the addition of an NLS sequence may be a general,
simple method to enhance recombinase activity in mammalian cells we extended
our studies to two additional recombinases: XisA recombinase (XisA) derived
from the cyanobacterium Anabaena, and SSV-Integrase (SSV-Int) derived from
the SSV1 virus of the thermophilic archaeon Sulfolobus shibatae. To this end we
constructed mammalian expression vectors for the wildtype forms of XisA and
SSV recombinases and compared their activity to versions which were modified
by the N-terminal addition of the 7 amino acid NLS of the SV40 T-antigen. These
recombinases were compared by the use of the reporter vector shown in Fig. 1E,
except that the att elements of C31-Int were replaced by the nif recognition
sequences for XisA or the att sequences for SSV-Int. As described above for
C31-Int, recombinase activities were tested by transient transfection into CHO cells
using the reporter vector derived β-galactosidase activity as readout and
cotransfected luciferase as internal control.

As shown in Fig. 3, for both XisA and SSV recombinases the addition of an NLS
sequence did not improve their activity in a mammalian cell line as compared to
the wildtype forms. At both DNA concentrations tested wildtype XisA exhibits a
significant recombination activity as compared to the reporter vector alone
(compare samples 2 and 3 to sample 1), but this activity is not altered by the
addition of an NLS (compare samples 2 and 3 to samples 4 and 5). SSV-Int
exhibits only a low recombination activity (compare samples 7 and 8 with sample
6) which is also not enhanced by the addition of an NLS (compare samples 9 and
10 with samples 7 and 8). From these results we conclude that the addition of an
NLS to an inefficient recombinase is not a general, simple method to improve its
performance in mammalian cells.

Taken together, in the transient transfection test system shown in Figure 2 a
more than twofold activity increase of the ΦC31 integrase could be achieved by
the C-terminal, but not the N-terminal addition of the SV40 T antigen NLS signal.
As this signal sequence has been characterised to act as a nuclear localisation
signal (Kalderon et al., Cell, 39, 499 - 509 (1984)) we conclude that the
efficiency increase of C31-Int(CNLS) is the result of the improved nuclear
accumulation of this recombinase. The relative inefficiency of C31-Int(NNLS)
may be explained by the inaccessibility of the NLS peptide to the nuclear import
machinery at the N-terminal position of the C31-Int protein.

In particular, it could be shown that C31-Int(CNLS) recombines
extrachromosomal DNA in mammalian cells almost as efficiently as the widely used
Cre recombinase and thus provides an additional or alternative recombination
system of highest activity. The efficiency increase of C31-Int(CNLS) as compared
to its wildtype form is regarded as an invention of substantial use for
biotechnology.

Example 2 

As demonstrated in example 1, C31-Int recombinase with the C-terminal fusion of
the SV40 T-antigen NLS (C31-Int(CNLS)) shows in mammalian cells a
recombination activity comparable to Cre recombinase on an extrachromosomal
plasmid vector. It was further tested whether C31-Int(CNLS) exhibits a
similar activity on a recombination substrate which is chromosomally integrated
into the genome of mammalian cells. This question is critical for the use of a
recombination system for genome engineering as it is possible that a
recombinase may act efficiently on extrachromosomal substrates but is impaired
if the recognition sites are part of the mammalian chromatin. To characterise the
recombination activity of C31-Int(CNLS) and C31-Int(NNLS) on a chromosomal
substrate the pRK64 reporter plasmid (Fig. 1E), containing a pair of loxP and
att sites, was stably integrated into the genome of a mammalian cell line. One of the
stably transfected clones was chosen for further analysis and was transiently
transfected with recombinase expression vectors coding for C31-Int(CNLS),
C31-Int(NNLS), C31-Int(wt) or Cre recombinase. The activity of β-galactosidase
derived from the Cre expression vector recombined in these cells was taken as a
measure of recombination efficiency.



A. Plasmid constructions: all plasmids used and their purification are described in 
example 1. 


B. Cell culture and transfections: To generate a stably transfected C31-Int
reporter cell line 2.5 x 10^6 NIH-3T3 cells (Andersson et al., Cell, 16, 63-75
(1979); DSMZ #ACC59; DSMZ-Deutsche Sammlung von Mikroorganismen und
Zellkulturen GmbH, Mascheroder Weg 1b, D-38124 Braunschweig, Germany)
were electroporated with 5 µg pRK64 plasmid DNA linearised with ScaI and
plated into 10 cm petri dishes. The cells were grown in DMEM/Glutamax medium
(Life Technologies) supplemented with 10% fetal calf serum at 37°C, 10% CO2 in
humid atmosphere, and passaged upon trypsinisation. Two days after transfection
the medium was supplemented with 1 mg/ml of puromycin (Calbiochem) for the
selection of stable integrants. Upon the growth of resistant colonies these were
isolated under a stereo microscope and individually expanded in the absence of
puromycin. To demonstrate stable integration of the transfected vector, genomic
DNA of puromycin resistant clones was prepared according to standard methods
and 5-10 µg were digested with EcoRV. Digested DNA was separated in a 0.8%
agarose gel and transferred to nylon membranes (GeneScreen Plus, NEN
DuPont) under alkaline conditions for 16 hours. The filter was dried and
hybridised for 16 hours at 65°C with a probe representing the 5' part of the E.
coli β-galactosidase gene (1.25 kb NotI - EcoRV fragment of plasmid CMV-β-pA
(R. Kühn, unpublished)). The probe was radiolabelled with 32P-marked α-dCTP
(Amersham) using the Megaprime Kit (Amersham). Hybridisation was performed
in a buffer consisting of 10% dextran sulfate, 1% SDS, 50 mM Tris and 100 mM
NaCl, pH 7.5. After hybridisation the filter was washed with 2x SSC/1% SDS and
exposed to BioMax MS1 X-ray films (Kodak) at -80°C.

Transfections of the selected clone 3T3(pRK64)-3 with plasmid DNAs and the
measurement of β-galactosidase activities in lysates were essentially performed
as described in example 1 for CHO cells, except that 32 ng or 64 ng of the
recombinase expression plasmids and 218 or 186 ng of pUC19 plasmid were
used and the pRK64 plasmid was omitted from all samples.
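
The pUC19 amounts follow from filling each DNA mix up to a constant total. The
sketch below illustrates this bookkeeping; it assumes, as an inference from
example 1 rather than an explicit statement here, that each well again received
50 ng of the luciferase vector pUHC13-1 and 300 ng of plasmid DNA in total. The
function name is hypothetical.

```python
TOTAL_NG = 300.0  # assumed total plasmid DNA per well, carried over from example 1

def filler_ng(components_ng, total_ng=TOTAL_NG):
    """Mass of pUC19 (ng) needed to bring a DNA mix up to total_ng."""
    used = sum(components_ng.values())
    if used > total_ng:
        raise ValueError("components exceed the total DNA amount")
    return total_ng - used

# pRK64 is omitted because the reporter is integrated in the genome.
mix_32 = {"pUHC13-1": 50.0, "recombinase vector": 32.0}
mix_64 = {"pUHC13-1": 50.0, "recombinase vector": 64.0}

print(filler_ng(mix_32))  # 218.0
print(filler_ng(mix_64))  # 186.0
```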

C. Histochemical detection of β-galactosidase activity in transfected
3T3(pRK64)-3 cells

To directly demonstrate β-galactosidase expression in recombinase transfected
cells, 10^4 3T3(pRK64)-3 cells were plated one day before transfection into each
well of a 48-well tissue culture plate (Falcon). For the transient transfection of
cells with plasmids each well received into 250 µl of medium a total amount of
150 ng supercoiled plasmid DNA complexed beforehand with the FuGene6 transfection
reagent (Roche Diagnostics GmbH, Mannheim, Germany) according to the
manufacturer's protocol. Each 150 ng DNA preparation contained 50 ng of the
recombinase expression vector pCMV-Cre or pCMV-C31Int(CNLS) and 100 ng of
the pUC19 plasmid. After 2 days the culture medium was removed from the
wells, the wells were washed once with phosphate buffered saline (PBS), and the
cells were fixed for 5 minutes at room temperature in a solution of 2%
formaldehyde and 1% glutaraldehyde in PBS. Next, the cells were washed twice
with PBS and finally incubated in X-Gal staining solution for 24 hours at 37°C
(staining solution: 5 mM K3(Fe(CN)6), 5 mM K4(Fe(CN)6), 2 mM MgCl2, 1 mg/ml
X-Gal (BioMol) in PBS) until photographs were taken.



D. Results 

To generate a mammalian cell clone with a stable genomic integration of the
C31-Int and Cre recombinase reporter plasmid pRK64, the murine fibroblast cell
line NIH-3T3 was electroporated with linearised pRK64 DNA (Fig. 1D; see also
example 1) and subjected to selection in puromycin containing growth medium.
Plasmid pRK64 contains in between the pair of loxP and att sites the coding
region of the puromycin resistance gene expressed from the SV40-IE promoter.
Thirty-six puromycin resistant clones were isolated and the genomic DNA of 19
clones was analysed for the presence and copy number of the pRK64 DNA. Three
clones, which apparently contain 2-4 copies of pRK64, were further
characterised on the single cell level for the expression of β-galactosidase upon
transient transfection with the Cre expression vector pCMV-Cre (Fig. 1C). The cell
clone with the largest proportion of β-galactosidase positive cells, 3T3(pRK64)-3,
was selected as most useful for the planned studies on C31-Int and Cre
recombinase efficiency.

To compare the efficiency of wildtype C31-Int (C31-Int(wt)), C31-Int(CNLS),
C31-Int(NNLS), and Cre recombinases 32 ng or 64 ng of the recombinase
expression vectors pCMV-C31Int(wt), pCMV-C31Int(CNLS), pCMV-C31Int(NNLS),
or pCMV-Cre (Fig. 1 A-D) together with the unrelated cloning plasmid pUC19
were transiently introduced into 3T3(pRK64)-3 cells, such that all samples
contained the same amount of plasmid DNA. As a negative control a sample
prepared with the pUC19 plasmid alone was used. Two days after transfection
the cells from the various samples were lysed and the activity of β-galactosidase
in the lysates was determined by a specific chemiluminescence assay and
expressed in "Relative Light Units" (RLU (β-Gal)) (Fig. 4). From the values of the
four data points of each sample the mean value and standard deviation were
calculated as an indicator of recombinase activity (Fig. 4). The relative activity of
the tested recombinases was then compared to the highest value obtained with
the Cre expression vector, defined as an activity of 1.

As shown in Figure 4 the expression of Cre recombinase (samples 8 and 9)
resulted in a 36 to 49-fold increase of β-galactosidase activity as compared to the
negative control (sample 1), demonstrating the dynamic range of the test system
used. Each recombinase vector was tested using two different amounts of DNA
for transfection (32 ng and 64 ng/sample), which in the case of Cre resulted in
73% and 100% recombinase activity (samples 8 and 9). These two values
establish that the DNA amounts used are not far from the linear range of the test
system's capacity for recombinase expression as the twofold increase of the amount
of DNA also resulted in a significant increase of recombinase activity.

The expression of wildtype C31-Int (Fig. 4, samples 2 and 3) resulted in a low
recombinase activity of 4% and 10% as compared to the values obtained by Cre
transfection (compare samples 2 and 3 with 8 and 9). This activity was only
moderately enhanced by the expression of C31-Int(NNLS) to values of 19% and
22% of Cre activity (compare samples 4 and 5 with samples 8 and 9). Upon the
expression of C31-Int(CNLS) values of 48% and 78% recombinase activity were
obtained as compared to Cre recombinase (compare samples 6 and 7 to 8 and
9). Hence, C31-Int(CNLS) exhibits a 12-fold higher activity than C31-Int(wt) at
32 ng plasmid DNA (Fig. 4, compare samples 6 and 2) and an 8-fold higher
activity than C31-Int(wt) at 64 ng plasmid DNA (Fig. 4, compare samples 7 and
3).

In addition, it was aimed to directly demonstrate in situ the expression of
β-galactosidase in 3T3(pRK64)-3 cells after transfection with Cre or C31-Int(CNLS)
recombinase plasmid. Two days after transfection the cells were fixed in situ and
incubated with the histochemical X-Gal assay which detects β-galactosidase
positive cells by a blue precipitate. As shown in Figure 5 stained cells were found
at a comparable frequency in the samples transfected with the Cre or
C31-Int(CNLS) expression vectors but not in the non-transfected control. This result
confirms that the β-galactosidase activities measured by chemiluminescence
upon recombinase transfection (Fig. 4) result from a population of individual,
recombined reporter cells.

In conclusion, upon the transient transfection of recombinase expression vectors
into a cell line with a genomic integration of the recombination substrate vector,
an 8 - 12-fold activity increase of the ΦC31 integrase by the C-terminal fusion
with the SV40 T-antigen NLS signal was found. As this signal sequence has been
characterised to act as a nuclear localisation signal (Kalderon et al., Cell, 39, 499
- 509 (1984)), it was concluded that the dramatic efficiency increase of
C31-Int(CNLS) is the result of the improved nuclear accumulation of this
recombinase. The approximately tenfold activity increase of C31-Int(CNLS) upon
expression in a cell line with a genomic integration of the substrate vector is
considerably higher than the activity increase found upon the transient
expression of both vectors (see example 1). Thus, a substrate vector integrated
into the chromatin of a mammalian cell may pose more stringent requirements
on recombinase activity to be recombined as compared to an extrachromosomal
substrate.

The dramatic activity increase of C31-Int(CNLS), as compared to its wildtype
form, on a stably integrated substrate in mammalian cells is an invention of
significant practical use as this recombinase is nearly as efficient as the widely
used Cre/loxP system; thus, C31-Int(CNLS) provides an additional or alternative
recombination system of highest activity.

Example 3 

To demonstrate that the increase in β-galactosidase activity obtained by the
cotransfection of a C31-Int expression vector and the reporter vector pRK64 into
mammalian cells is in fact the result of recombinase mediated deletion, one of
the recombination products was detected by a specific polymerase chain reaction
(PCR). The amplified PCR product was cloned and its sequence determined. The
obtained sequence confirms that C31-Int mediated deletion of the test vector
occurs in a mammalian cell line and that the recombination occurs at the known
breakpoint within the attB and attP sites.



A. Plasmid constructions: The construction of plasmids pRK64, pCMV-Cre and
pCMV-C31-Int(wt) is described in Example 1. To simulate the recombination of
pRK64 by C31-Int, the sequence between the CAA motifs of the att sites
(boxed in Fig. 5) was deleted from the computer file of pRK64, giving rise to the
sequence of pRK64(ΔInt) (SEQ ID NO:16).
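
The in silico deletion described above can be illustrated with a small string
operation. This is a sketch under stated assumptions and not the procedure used
to produce SEQ ID NO:16: the function name is hypothetical, and the sequences
are short invented placeholders, not the 84 bp attB and attP sites of pRK64. The
sketch keeps one copy of the core motif, so that a single hybrid site remains.

```python
def simulate_att_deletion(seq, attb, attp, core="CAA"):
    """Delete the segment lying between the core motifs of attB and attP.

    Everything up to and including the core motif in attB is kept, and the
    sequence resumes immediately after the core motif in attP.
    """
    b_start = seq.index(attb)
    p_start = seq.index(attp, b_start + len(attb))
    b_core_end = b_start + attb.index(core) + len(core)
    p_core_end = p_start + attp.index(core) + len(core)
    return seq[:b_core_end] + seq[p_core_end:]

# Invented toy sequences for illustration only.
toy_attb = "GGTGCAAGGGC"
toy_attp = "CCCCAACTGGG"
toy_plasmid = "AAAA" + toy_attb + "TTTTTTTT" + toy_attp + "GGGG"

recombined = simulate_att_deletion(toy_plasmid, toy_attb, toy_attp)
print(recombined)
```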

B. Transfection of Cells and PCR amplification: MEF5-5 mouse fibroblasts
(Schwenk et al., 1998) (20000 cells per well of a 12 well plate (Falcon)) were
transfected with 0.5 µg pRK64 alone or together with 250 ng pCMV-Int(wt) or
pCMV-Cre using the FuGene6 transfection reagent following the manufacturer's
protocol (Roche Diagnostics). After 2 days DNA was extracted from these cells
according to standard methods and used for PCR amplification with primers
P64-1 (SEQ ID NO:17; complementary to position 111-135 of pRK64(ΔInt)) and
P64-4 (SEQ ID NO:18; complementary to position 740-714 of pRK64(ΔInt)) using the
Expand High Fidelity PCR kit (Roche Diagnostics). PCR products were separated
on a 0.8% agarose gel, extracted with the QIAEX kit (Qiagen) and cloned into
the pCR2.1 vector using the TA cloning kit (Invitrogen) resulting in plasmid
pRK80d. The sequence of its insert, seq80d (SEQ ID NO:19), was determined
using the reverse sequencing primer and standard sequencing methods (MWG
Biotech).
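
The expected size of the PCR product can be derived from the primer coordinates
given above. The sketch below assumes 1-based, inclusive positions on
pRK64(ΔInt), with P64-1 starting at position 111 and the reverse primer P64-4
starting at position 740; the function name is hypothetical.

```python
def amplicon_length(forward_5prime, reverse_5prime):
    """Length of a PCR product from the 5' positions of both primers
    (1-based, inclusive coordinates on the same template)."""
    return reverse_5prime - forward_5prime + 1

# P64-1 covers positions 111-135, P64-4 covers positions 740-714.
product_bp = amplicon_length(111, 740)
print(product_bp)  # 630
```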



For the measurement of β-galactosidase activity the cells were lysed 2 days after
transfection and the β-galactosidase activities were determined with the
β-galactosidase reporter gene assay (Roche Diagnostics) as described in example
1.



C. Results: As a test vector for C31-Int mediated DNA recombination plasmid
pRK64 was used, which contains the 1.1 kb coding region of the puromycin
resistance gene flanked 5' by the 84 bp attB and 3' by the 84 bp attP
recognition site of C31-Int (Fig. 5). These attB and attP sites are located on the
same DNA molecule and oriented relative to each other in a way which allows the
deletion of the att-flanked DNA segment. The same orientation of attB and attP
sites is used naturally by the ΦC31 phage and the bacterial genome for the
integration of the phage genome when both sites are located on different DNA
molecules (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)). As
a positive control, vector pRK64 contains in addition two Cre recombinase
recognition (loxP) sites in direct orientation next to the att sites. Since the
att-flanked DNA segment in plasmid pRK64 is inserted between a promoter active in
mammalian cells and the β-galactosidase gene, its deletion can be measured by
the increase of β-galactosidase activity. The expected product of C31-Int
mediated deletion of plasmid pRK64 is shown in Fig. 6, designated as
pRK64(ΔInt). If the recombination between attB and attP occurs as described in
bacteria (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 - 5510 (1998)), a
single attR site is generated and left on the parental plasmid (Fig. 6) while the
flanked DNA is excised and contains an attL site. Besides the measurement of
β-galactosidase activity, C31-Int mediated recombination of pRK64 can be directly
detected on the DNA level by a specific polymerase chain reaction (PCR) using
the primers P64-1 and P64-4 (Fig. 6). These primers, located 5' of the attB site
(P64-1) and 3' of the attP site (P64-4), are designed to amplify a PCR product of
630 bp length upon the C31-Int mediated recombination of pRK64. For the expression
of C31-Int in mammalian cells plasmid pCMV-C31(wt) was used, which contains the
CMV-IE-Promoter upstream of the C31-Int coding region followed by a synthetic
polyadenylation signal (see Example 1 and Fig. 1).

The recombination substrate vector pRK64 was transiently transfected into the
murine fibroblast cell line MEF5-5 either alone, or together with the C31-Int
expression vector pCMV-C31(wt), or together with an expression vector for Cre
recombinase, pCMV-Cre. Two days after transfection half the cells of each sample
were lysed and used to measure β-galactosidase activity by chemiluminescence,
and the other half was used for the preparation of DNA from the transfected cells
for PCR analysis. The β-galactosidase levels of the 3 samples were as
follows (expressed as "Relative Light Units" (RLU) with standard deviation (SD)
of the β-galactosidase assay):



Sample                        RLU (SD)

1) pRK64                      692 ± 5
2) pRK64 + pCMV-Cre           8527 ± 269
3) pRK64 + pCMV-C31(wt)       1288 ± 93



As the coexpression of the test vector pRK64 together with the C31-Int
expression vector in sample 3 leads to a significant increase of β-galactosidase
activity as compared to pRK64 alone, this result suggests that pRK64 is
recombined by C31-Int as anticipated in Fig. 6.
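
The size of the increase over the reporter alone can be read from the RLU means
listed above. The sketch below simply divides each mean by the mean of sample 1;
the variable names are illustrative, and the standard deviations are not
propagated.

```python
# Mean RLU values of the three samples, as listed in the text.
rlu = {
    "pRK64": 692,
    "pRK64 + pCMV-Cre": 8527,
    "pRK64 + pCMV-C31(wt)": 1288,
}
background = rlu["pRK64"]

# Fold increase of each cotransfected sample over the reporter alone.
fold = {name: value / background for name, value in rlu.items() if name != "pRK64"}

for name, value in fold.items():
    print(f"{name}: {value:.1f}-fold")
```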

Next, cellular DNA was prepared from the three samples and tested for the
occurrence of the expected Cre or C31-Int generated deletion product by PCR
using primers P64-1 and P64-4 for amplification. As shown in Fig. 7 an
amplification product of the expected size was found only in the samples
cotransfected with the Cre or C31-Int recombinase expression vectors (Fig. 7A,
lane 3 and lane 4). The PCR products amplified from pRK64 recombined by
C31-Int or Cre are of the same size but should be recombined via the attB/P or loxP
sites, respectively.

To prove that the PCR product found after cotransfection of plasmids pRK64 and
pCMV-C31(wt) represents in fact the deletion product of C31-Int mediated
recombination, this DNA fragment was cloned into a plasmid vector and its DNA
sequence determined. One clone, pRK80d, was analysed, and its sequence
showed exactly the sequence of an attR site as expected from C31-Int mediated
deletion of pRK64 (Fig. 7B, compare to Fig. 6).



In conclusion, this experiment demonstrates that C31-Int mediated deletion of a
vector containing a pair of attB/attP sites occurs in a mammalian cell line, and
that the recombination occurs within the same 3 bp breakpoint region of attB and
attP as found in bacteria (Thorpe et al., Proc. Natl. Acad. Sci. USA, 95, 5505 -
5510 (1998)). Thus, it was concluded that an increase of β-galactosidase activity
observed by cotransfection of the pRK64 reporter vector and a C31-Int
expression vector in mammalian cells truly reflects C31-Int recombinase activity.


Example 4 

As has been demonstrated in examples 1-3, the C31-Int recombinase with the
C-terminal fusion of the SV40 T-antigen NLS (C31-Int(CNLS)) shows a
recombination activity comparable to Cre recombinase on an extrachromosomal
as well as a chromosomally integrated target in mammalian cells in vitro. To test
whether C31-Int(CNLS) exhibits activity in mice, transgenic mice carrying a
C31-Int(CNLS) expression vector were generated. These transgenic mice were
crossed with reporter mice carrying the recombinase substrate.
Recombination-mediated expression of β-galactosidase, which can be measured by
staining with the substrate X-Gal, was analyzed in testes of double transgenic
progeny carrying both the recombinase and the reporter.

A. Plasmid constructions: For the construction of the C31-Int(CNLS) transgene
expression vector, the C31Int gene with C-terminal NLS was isolated as a 2 kb
fragment generated by restriction of pCMV-C31Int(CNLS) (SEQ ID NO:12) with
BglII. The fragment was ligated into the BglII restriction site of the vector
pCAGGS-Cre-pA (SEQ ID NO:104) giving rise to the plasmid
pCAGGS-C31CNLS-pA (SEQ ID NO:105). In pCAGGS-C31CNLS-pA the C31-Int(CNLS)
(position 1891-3753) is transcribed from the CAGGS promoter (position 1-1616) and
followed by the SV40 late region polyadenylation sequence (position 3763-3941).

B. Production of transgenic mice: For the embryo injections a 3.95 kb fragment
was generated by restriction of the plasmid pCAGGS-C31CNLS-pA with PstI and
AscI. This fragment was purified as follows: DNA bands were separated on an
agarose gel without ethidium bromide. One part of the gel was stained with
ethidium bromide to locate the band to excise. The DNA was electroeluted from
the excised band with an S&S Biotrap Elution Chamber in 1x TAE (40 mM
Tris-acetate, 1 mM EDTA) overnight. The DNA was precipitated from the eluate with
1/10 volume 3 M sodium acetate and 2.5 volumes ethanol at -20 °C for several
hours. The DNA was pelleted by centrifugation at 13000 rpm for 30 min and
washed twice with 70 % ethanol. The dried DNA pellet was resuspended in TE
(10 mM Tris, 1 mM EDTA, pH 8). Subsequently the precipitation procedure was
repeated once and the DNA resuspended in injection buffer (10 mM Tris pH 7.2,
0.1 mM EDTA). The sample was dialysed with a Slide-A-Lyzer Mini Dialysis Unit
(Pierce) in injection buffer with several changes of buffer at 4°C overnight.
Different amounts of the sample were checked on a gel to determine the
concentration. To generate transgenic mice, 5-10 fg of the purified fragment
were injected into one pronucleus of (B6CBA)F2 mouse one-cell embryos. The
injected embryos were subsequently transferred into the oviduct of 0.5 day
pseudopregnant NMRI females.
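
For orientation, the injected mass can be converted into an approximate number
of fragment copies. The sketch below is an estimate only: it assumes an average
mass of about 650 g/mol per base pair of double-stranded DNA, a common
approximation that is not taken from the text, and the function name is
hypothetical. Under this assumption, 5-10 fg of a 3.95 kb fragment corresponds
to roughly 1200 to 2300 molecules.

```python
AVOGADRO = 6.022e23      # molecules per mole
BP_MASS_G_PER_MOL = 650  # assumed average mass of one base pair of dsDNA

def fragment_copies(mass_fg, length_bp):
    """Approximate number of dsDNA molecules in mass_fg femtograms."""
    grams = mass_fg * 1e-15
    moles = grams / (length_bp * BP_MASS_G_PER_MOL)
    return moles * AVOGADRO

# 3.95 kb injection fragment, 5 fg and 10 fg as stated in the text.
copies_5fg = fragment_copies(5, 3950)
copies_10fg = fragment_copies(10, 3950)
print(round(copies_5fg), round(copies_10fg))
```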



C. Analysis of transgenic mice: Mice were analyzed for the presence of the
pCAGGS-C31CNLS-pA transgene by PCR using tail DNA and the primers
C31-screen 1 (SEQ ID NO:100) and C31-screen 2 (SEQ ID NO:101) amplifying a
fragment of 500 bp. The PCR reaction contained 5 µl PCR buffer (Invitrogen),
2 µl 50 mM MgCl2, 1.5 µl 10 mM dNTP-mix, 2 µl (10 pmol) of each primer, 0.5 µl
Taq-polymerase (5 U/µl) and water to a volume of 50 µl. The program used for
the PCR reactions was: 94 °C for 30 s, 55 °C for 30 s and 72 °C for 1 min in
30 cycles.
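The reaction composition given in section C can be turned into final concentrations with simple dilution arithmetic. The following is a minimal sketch: the variable and function names are hypothetical, and the volume of template DNA is not stated in the text, so it is treated here as part of the water fraction.

```python
TOTAL_UL = 50.0  # final reaction volume stated in section C


def final_concentration(stock_conc: float, volume_ul: float, total_ul: float = TOTAL_UL) -> float:
    """Dilution formula c_final = c_stock * v_added / v_total (units follow the stock)."""
    return stock_conc * volume_ul / total_ul


# Volumes in microlitres as listed for the C31-screen PCR
components_ul = {
    "PCR buffer": 5.0,
    "50 mM MgCl2": 2.0,
    "10 mM dNTP mix": 1.5,
    "primer C31-screen 1": 2.0,
    "primer C31-screen 2": 2.0,
    "Taq polymerase (5 U/ul)": 0.5,
}

# Remaining volume; assumed to cover water plus the (unstated) template volume
water_plus_template_ul = TOTAL_UL - sum(components_ul.values())

mgcl2_mM = final_concentration(50.0, 2.0)   # mM
dntp_mM = final_concentration(10.0, 1.5)    # mM
primer_uM = 10.0 / TOTAL_UL                 # 10 pmol in 50 ul = pmol/ul = uM
taq_units = 5.0 * 0.5                       # U per reaction

if __name__ == "__main__":
    print(f"water + template: {water_plus_template_ul} ul")
    print(f"MgCl2: {mgcl2_mM} mM, dNTP mix: {dntp_mM} mM")
    print(f"each primer: {primer_uM} uM, Taq: {taq_units} U")
```

The same helper applies to the LacZ-specific PCR of section D by substituting its volumes.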

D. Analysis of C31-Int(CNLS) activity: Founder mice transgenic for the
pCAGGS-C31CNLS-pA transgene were crossed to heterozygous C31 reporter mice carrying
the C31 reporter construct in the ROSA26 locus (SEQ ID NO:106) (Fig. 8).
Offspring of the crosses were genotyped for the presence of the
pCAGGS-C31CNLS-pA transgene by the PCR assay described in section C as well as for the
ROSA26-C31 reporter allele by a LacZ-specific PCR assay. The PCR was
performed using tail DNA and the primers β-Gal 3 (SEQ ID NO:102) and β-Gal 4
(SEQ ID NO:103) amplifying a fragment of 315 bp. The PCR reaction contained
5 µl PCR buffer (Invitrogen), 2.5 µl 50 mM MgCl2, 2 µl 10 mM dNTP-mix, 1 µl
(10 pmol) of each primer, 0.4 µl Taq-polymerase (5 U/µl) and water to a volume of
50 µl. The program used for the PCR reactions was: 94 °C for 1 min, 60 °C for
1 min and 72 °C for 1 min in 30 cycles.
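For planning such a cross, the expected fraction of offspring that test positive in both PCR assays can be estimated with simple Mendelian arithmetic. This is only a sketch under assumptions the text does not state: the founder transmits the transgene as a single hemizygous, non-mosaic insertion that segregates independently of ROSA26. The function name `offspring_classes` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def offspring_classes(p_transgene=Fraction(1, 2), p_reporter=Fraction(1, 2)):
    """Expected genotype fractions keyed by (has_transgene, has_reporter).

    p_transgene: transmission probability of the recombinase transgene
                 (1/2 for a hemizygous, non-mosaic founder; assumption).
    p_reporter:  transmission probability of the reporter allele
                 (1/2 for a heterozygous reporter parent, as in section D).
    """
    classes = {}
    for has_tg, has_rep in product((True, False), repeat=2):
        p_tg = p_transgene if has_tg else 1 - p_transgene
        p_rep = p_reporter if has_rep else 1 - p_reporter
        classes[(has_tg, has_rep)] = p_tg * p_rep
    return classes


if __name__ == "__main__":
    for (has_tg, has_rep), fraction in offspring_classes().items():
        print(f"transgene={has_tg!s:5} reporter={has_rep!s:5} expected fraction {fraction}")
```

Founder animals are frequently mosaic, so the transmission probability passed as `p_transgene` may in practice be lower than 1/2.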

Testes from mice carrying the pCAGGS-C31CNLS-pA transgene as well as the
reporter locus and from a control mouse carrying the reporter allele only were
dissected. The tissues were embedded in OCT Tissue freezing medium
(Leica/Jung) and frozen in liquid nitrogen. Cryosections were generated from the
embedded tissues using a Leica CM3050 cryomicrotome, dried on
polylysine-coated slides for 1-4 hours and then stained as follows: Sections were fixed in
0.2 % glutaraldehyde, 5 mM EGTA, 2 mM MgCl2 in 0.1 M PB (K2HPO4/KH2PO4, pH
7.3) for 5 min at room temperature and washed in wash buffer (2 mM MgCl2,
0.02 % Nonidet-40 in 0.1 M PB) 3 times for 15 min. Then sections were
stained in X-Gal solution (0.6 mg/ml X-Gal in DMSO, 5 mM potassium
hexacyanoferrate III, 5 mM potassium hexacyanoferrate II in LacZ wash buffer)
overnight at 37 °C. After staining, sections were washed in 1x PBS twice for 5
min. Dehydration was performed by washing the sections first with 70 %, 96 %
and 100 % ethanol for 2 min each, then with a 1:1 mix of ethanol and xylol for 5
min and finally with xylol only for 5 min. Before taking pictures, sections were
mounted in Entellan.

E. Results: To identify transgenic founder mice carrying the pCAGGS-C31CNLS-pA
transgene, 29 mice born from the injection experiment were analyzed for the
presence of the transgene. 5 founder mice (3 females and 2 males) were
identified. To analyze the activity of the C31-Int(CNLS) recombinase in
transgenic mice, 2 of the female founder mice were crossed to heterozygous C31
reporter mice carrying a C31 reporter construct in the ROSA26 locus (Fig. 8).
From each of these crosses, one offspring carrying the pCAGGS-C31CNLS-pA
transgene as well as the C31 reporter allele was sacrificed. In order to determine
whether pCAGGS-C31CNLS-pA transgenic mice are able to delete an attB/P
flanked DNA sequence in the mouse germline, tissue sections from the testes of
the sacrificed animals were prepared and stained for β-galactosidase activity with
X-Gal. Fig. 9 shows the result of the staining experiment for one of these mice
(A) as well as a control mouse carrying only the reporter allele, but lacking the
pCAGGS-C31CNLS-pA transgene (B). Clear staining can be detected in the
maturing sperm cells in about 50 % of the tubules, with the proportion of
β-galactosidase expressing cells ranging between 10 and 100 %. No staining could be
detected for the control mouse. This clearly demonstrates that C31-Int-mediated
recombination has taken place during spermatogenesis in the pCAGGS-C31CNLS-pA
transgenic mice. These results show that the C31-Int is functional in vivo, in a
transgenic mouse system, and therefore provides a new tool to introduce specific
deletions, inversions or integrations into the mouse germline.

1. A fusion protein comprising

(a) a recombinase domain comprising a recombinase protein or a mutant
thereof having a recombinase activity similar to that of the corresponding
wild-type recombinase and

(b) a signal peptide domain linked to said recombinase domain which directs
nuclear import of said fusion protein in eucaryotic cells.

2. The fusion protein of claim 1, wherein the activity of the fusion protein in
eucaryotic cells is significantly higher as compared to that of the wild-type
recombinase corresponding to the recombinase of the recombinase domain.

3. The fusion protein of claim 1 or 2, wherein the recombinase domain comprises
a recombinase protein belonging to the family of large serine recombinases or a
mutant thereof, preferably the recombinase domain comprises a recombinase
protein selected from the group consisting of bacteriophage ΦC31 integrase
(C31-Int), coliphage P4 recombinase, Listeria phage recombinase, bacteriophage
R4 Sre recombinase, CisA recombinase, XisF recombinase, transposon Tn4451
TnpX recombinase and lactococcal bacteriophage TP901-1 recombinase, or a
mutant thereof; most preferably the recombinase protein is a C31-Int protein or
a mutant thereof.

4. The fusion protein of claim 3, wherein the recombinase protein comprises a
C31-Int having the amino acid sequence shown in SEQ ID NO:21 or a C-terminal
truncated form thereof, said truncated form of the C31-Int preferably comprising
amino acid residues of 306 to 613 of SEQ ID NO:21.

5. The fusion protein according to any one of claims 1 to 4, wherein the signal
peptide domain is derived from yeast GAL4, SKI3, L29 or histone H2B proteins,
polyoma virus large T protein, VP1 or VP2 capsid protein, SV40 VP1 or VP2 capsid
protein, adenovirus E1a or DBP protein, influenza virus NS1 protein, hepatitis
virus core antigen or the mammalian lamin, c-myc, max, c-myb, p53, c-erbA,
jun, Tax, steroid receptor or Mx proteins, SV40 T-antigen or other proteins with
known nuclear localisation, preferably the signal peptide domain comprises a
peptide which is derived from the SV40 T-antigen.

6. The fusion protein according to any one of claims 1 to 5, wherein the signal
peptide domain

(i) has a length of 5 to 74, preferably 7 to 15 amino acid residues, and/or

(ii) comprises a segment of 6 amino acid residues having at least 2 positively
charged basic amino acid residues, said basic amino acid residues being
preferably selected from lysine, arginine and histidine.

7. The fusion protein of claim 5 or 6, wherein the signal peptide domain
comprises a peptide selected from a sequence shown in SEQ ID NOs:24 to 53,
preferably the signal peptide comprises the amino acid sequence
Pro-Lys-Lys-Lys-Arg-Lys-Val (SEQ ID NO:53).
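The two structural criteria of claim 6 (length, and a 6-residue segment with at least two basic residues) can be checked mechanically for a candidate peptide. The sketch below is purely illustrative and is not part of the claimed subject matter; the function names and the reduced three-letter code table are hypothetical helpers, and the peptide of SEQ ID NO:53 is used as the example input.

```python
# Reduced three-letter to one-letter table; extend as needed (illustrative helper)
THREE_TO_ONE = {"Pro": "P", "Lys": "K", "Arg": "R", "Val": "V", "His": "H", "Ala": "A", "Gly": "G"}
BASIC = set("KRH")  # lysine, arginine, histidine as named in claim 6


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter peptide, e.g. 'Pro-Lys', to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in peptide.split("-"))


def length_ok(seq: str, min_len: int = 5, max_len: int = 74) -> bool:
    """Criterion (i): length within the stated range."""
    return min_len <= len(seq) <= max_len


def has_basic_segment(seq: str, window: int = 6, min_basic: int = 2) -> bool:
    """Criterion (ii): some stretch of `window` residues has at least `min_basic` basic residues."""
    return any(
        sum(res in BASIC for res in seq[i:i + window]) >= min_basic
        for i in range(len(seq) - window + 1)
    )


if __name__ == "__main__":
    nls = to_one_letter("Pro-Lys-Lys-Lys-Arg-Lys-Val")
    print(nls, length_ok(nls), length_ok(nls, 7, 15), has_basic_segment(nls))
```

A peptide shorter than six residues cannot contain a 6-residue segment, so `has_basic_segment` returns False for it.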

8. The fusion protein according to any one of claims 1 to 6, wherein

(i) the signal peptide domain is linked to the N-terminal or C-terminal of the
recombinase domain or is integrated into the recombinase domain, preferably the
signal peptide domain is linked to the C-terminal of the recombinase domain;
and/or

(ii) the signal peptide domain is linked to the recombinase domain directly or
through a linker peptide, said linker preferably having 1 to 30 essentially neutral
amino acid residues.

9. The fusion protein of claim 1 comprising the amino acid sequence shown in
SEQ ID NO:23.

10. A DNA coding for the fusion protein according to any one of claims 1 to 9.

11. A vector containing the DNA as defined in claim 10.

12. A microorganism containing the DNA of claim 10 and/or the vector of claim
11.

13. A process for preparing the fusion protein as defined in any one of claims 1 to
9 which comprises culturing a microorganism as defined in claim 11 under
conditions suitable for expression of said fusion protein and recovering said
fusion protein.

14. Use of the fusion protein as defined in any one of claims 1 to 9 to recombine
DNA molecules, which contain recombinase recognition sequences for the
recombinase protein of the recombinase domain, in eucaryotic cells.

15. A cell, preferably a mammalian cell containing the DNA sequence of claim 10
in its genome.

16. The cell of claim 15, also containing recognition sequences for the
recombinase protein of the recombinase domain in its genome.

17. Use of the cell of claim 15 or 16 for studying the function of genes and for
the creation of transgenic organisms.

18. A transgenic organism, preferably a transgenic non-human mammal
containing the DNA sequence of claim 10 in its genome.

19. The transgenic organism of claim 18 also containing recognition sequences
for the recombinase protein of the recombinase domain in its genome.

20. Use of the transgenic organism of claim 18 or 19 for studying gene function
at various developmental stages.

21. A method for recombining DNA molecules of cells or organisms containing
recombinase recognition sequences for the recombinase protein of the
recombinase domain as defined in claims 1 to 9, which method comprises
supplying the cells or organisms with a fusion protein as defined in claims 1 to 9
or with a DNA sequence of claim 10 and/or a vector of claim 11 which are
capable of expressing said fusion protein in the cell or organism.

22. A method for recombining a DNA molecule containing recognition sequences



SUBSTITUTE SHEET (RULE 26) 



WO 02/38613 PCT/EP01/12975 

46 

for a recombinase protein in a eucaryotic cell, said method comprising contacting
the cell with a fusion protein according to claim 1 that recognizes said recognition
sequences, wherein the fusion protein catalyzes recombination of the DNA
molecule.

23. The fusion protein according to any one of claims 1 to 9 which catalyzes
recombination at recognition sequences for the recombinase protein.

24. A transgenic organism, preferably a transgenic non-human mammal,
comprising a cell containing a DNA sequence coding for a recombinase fusion
protein as defined in claims 1 to 9 and 23 in its genome.

SEQUENCE LISTING

<110> Artemis Pharmaceuticals GmbH

<120> Modified Recombinase

<130> 012787wo/JH/ml

<140>
<141>

<160> 108

<170> PatentIn Ver. 2.1

<210> 1
<211> 86
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-4

<400> 1
cgtgacggtc tcgaagccgc ggtgcgggtg ccagggcgtg cccttgggct ccccgggcgc 60
gtactccacc tcacccatct ggtcca 86

<210> 2
<211> 86
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-5

<400> 2
cgtggaccag atgggtgagg tggagtacgc gcccggggag cccaagggca cgccctggcc 60
cacgcaccgc ggcttcgaga ccgtca 86

<210> 3
<211> 90
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-6

<400> 3
gatccagaag cggttttcgg gagtagtgcc ccaactgggg taacctttga gttctctcag 60
ttgggggcgt agggtcgccg acatgacacg 90

<210> 4
<211> 90
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-7-2

<400> 4
gatccgtgtc atgtcggcga ccctacgccc ccaactgaga gaactcaaag gttaccccag 60
ttggggcact actcccgaaa accgcttctg 90
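
A listing recovered from a scanned page can be sanity-checked by testing whether oligonucleotides that appear to form a pair are in fact reverse complements of each other. The sketch below does this for primers C31-6 (SEQ ID NO:3) and C31-7-2 (SEQ ID NO:4) as printed above. The helper names are hypothetical, and reading the 4-nt `gatc` 5' ends as single-stranded overhangs of an annealed duplex is an assumption, not something this listing states.

```python
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.lower()))


def clean(listing_text: str) -> str:
    """Strip the spaces that separate 10-nt groups in the sequence listing."""
    return "".join(listing_text.split())


# Sequences copied from SEQ ID NO:3 and SEQ ID NO:4
C31_6 = clean(
    "gatccagaag cggttttcgg gagtagtgcc ccaactgggg taacctttga gttctctcag "
    "ttgggggcgt agggtcgccg acatgacacg"
)
C31_7_2 = clean(
    "gatccgtgtc atgtcggcga ccctacgccc ccaactgaga gaactcaaag gttaccccag "
    "ttggggcact actcccgaaa accgcttctg"
)


def duplex_core_matches(top: str, bottom: str, overhang: int = 4) -> bool:
    """True if `top` minus its 5' overhang pairs exactly with `bottom` minus its 5' overhang."""
    return top[overhang:] == reverse_complement(bottom)[:-overhang]


if __name__ == "__main__":
    print(len(C31_6), len(C31_7_2))
    print(duplex_core_matches(C31_7_2, C31_6))
```

A mismatch in such a check would point either to a transcription error in the listing or to oligonucleotides that were never intended to pair, so the result is a prompt for inspection rather than proof of an error.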

<210> 5
<211> 7438
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector pRK64

<400> 5
cgtcatcacc gaaacgcgcg aggcagctgt ggaatgtgtg tcagttaggg tgtggaaagt 60
ccccaggctc cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca 120
ggctccccag caggcagaag tgtgcaaagc atgcatctca attagtcagc aaccatagtc 180
ccgcccctaa ctccgcccat cccgccccta actccgccca gttccgccca ttctccgccc 240
catggctgac taattttttt tatttatgca gaggccgagg ccgcctcggc ctaggaacag 300
tcgacgacac tgcagagacc tacttcacta acaaccggta cagttcgtgg accagatggg 360
tgaggtggag tacgcgcccg gggagcccaa gggcacgccc tggcacccgc accgcggctt 420
cgagaccgtc acgaataact tcgtatagca tacattatac gaagttataa gcttgcatgc 480
ctgcaggtcg gccgccacga ccggccggcc ggtgccgcca ccatcccctg acccacgccc 540
ctgacccctc acaaggagac gaccttccat gaccgagtac aagcccacgg tgcgcctcgc 600
cacccgcgac gacgtccccc gggccgtacg caccctcgcc gccgcgttcg ccgactaccc 660
cgccacgcgc cacaccgtcg acccggaccg ccacatcgag cgggtcaccg agctgcaaga 720
actcttcctc acgcgcgtcg ggctcgacat cggcaaggtg tgggtcgcgg acgacggcgc 780
cgcggtggcg gtctggacca cgccggagag cgtcgaagcg ggggcggtgt tcgccgagat 840
cggcccgcgc atggccgagt tgagcggttc ccggctggcc gcgcagcaac agatggaagg 900
cctcctggcg ccgcaccggc ccaaggagcc cgcgtggttc ctggccaccg tcggcgtctc 960
gcccgaccac cagggcaagg gtctgggcag cgccgtcgtg ctccccggag tggaggcggc 1020
cgagcgcgcc ggggtgcccg ccttcctgga gacctccgcg ccccgcaacc tccccttcta 1080
cgagcggctc ggcttcaccg tcaccgccga cgtcgagtgc ccgaaggacc gcgcgacctg 1140
gtgcatgacc cgcaagcccg gtgcctgacg cccgccccac gacccgcagc gcccgaccga 1200
aaggagcgca cgaccccatg gctccgaccg aagccgaccc gggcggcccc gccgaccccg 1260
cacccgcccc cgaggcccac cgactctaga ggatcataat cagccatacc acatttgtag 1320
aggttttact tgctttaaaa aacctcccac acctccccct gaacctgaaa cataaaatga 1380
atgcaattgt tgttgttaac ttgtttattg cagcttataa tggttacaaa taaagcaata 1440
gcatcacaaa tttcacaaat aaagcatttt tttcactgca ttctagttgt ggtttgtcca 1500
aactcatcaa tgtatcttat catgtctgga tccgtgtcat gtcggcgacc ctacgccccc 1560
aaatgagaga actcaaaggt taccccagtt ggggcactac tcccgaaaac cgcttctgga 1620
tccataactt cgtatagcat acattatacg aagttatacc gggccaccat ggtcgcgagt 1680
agcttggcac tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg cgttacccaa 1740
cttaatcgcc ttgcagcaca tccccctttc gccagctggc gtaatagcga agaggcccgc 1800
accgatcgcc cttcccaaca gttgcgcagc ctgaatggcg aatggcgctt tgcctggttt 1860
ccggcaccag aagcggtgcc ggaaagctgg ctggagtgcg atcttcctga ggccgatact 1920
gtcgtcgtcc cctcaaactg gcagatgcac ggttacgatg cgcccatcta caccaacgta 1980
acctatccca ttacggtcaa tccgccgttt gttcccacgg agaatccgac gggttgttac 2040
tcgctcacat ttaatgttga tgaaagctgg ctacaggaag gccagacgcg aattattttt 2100
gatggcgtta actcggcgtt tcatctgtgg tgcaacgggc gctgggtcgg ttacggccag 2160
gacagtcgtt tgccgtctga atttgacctg agcgcatttt tacgcgccgg agaaaaccgc 2220
ctcgcggtga tggtgctgcg ttggagtgac ggcagttatc tggaagatca ggatatgtgg 2280
cggatgagcg gcattttccg tgacgtctcg ttgctgcata aaccgactac acaaatcagc 2340
gatttccatg ttgccactcg ctttaatgat gatttcagcc gcgctgtact ggaggctgaa 2400
gttcagatgt gcggcgagtt gcgtgactac ctacgggtaa cagtttcttt atggcagggt 2460
gaaacgcagg tcgccagcgg caccgcgcct ttcggcggtg aaattatcga tgagcgtggt 2520
ggttatgccg atcgcgtcac actacgtctg aacgtcgaaa acccgaaact gtggagcgcc 2580
gaaatcccga atctctatcg tgcggtggtt gaactgcaca ccgccgacgg cacgctgatt 2640
gaagcagaag cctgcgatgt cggtttccgc gaggtgcgga ttgaaaatgg tctgctgctg 2700
ctgaacggca agccgttgct gattcgaggc gttaaccgtc acgagcatca tcctctgcat 2760
ggtcaggtca tggatgagca gacgatggtg caggatatcc tgctgatgaa gcagaacaac 2820
tttaacgccg tgcgctgttc gcattatccg aaccatccgc tgtggtacac gctgtgcgac 2880
cgctacggcc tgtatgtggt ggatgaagcc aatattgaaa cccacggcat ggtgccaatg 2940
aatcgtctga ccgatgatcc gcgctggcta ccggcgatga gcgaacgcgt aacgcgaatg 3000
gtgcagcgcg atcgtaatca cccgagtgtg atcatctggt cgctggggaa tgaatcaggc 3060
cacggcgcta atcacgacgc gctgtatcgc tggatcaaat ctgtcgatcc ttcccgcccg 3120
gtgcagtatg aaggcggcgg agccgacacc acggccaccg atattatttg cccgatgtac 3180
gcgcgcgtgg atgaagacca gcccttcccg gctgtgccga aatggtccat caaaaaatgg 3240
ctttcgctac ctggagagac gcgcccgctg atcctttgcg aatacgccca cgcgatgggt 3300
aacagtcttg gcggtttcgc taaatactgg caggcgtttc gtcagtatcc ccgtttacag 3360
ggcggcttcg tctgggactg ggtggatcag tcgctgatta aatatgatga aaacggcaac 3420
ccgtggtcgg cttacggcgg tgattttggc gatacgccga acgatcgcca gttctgtatg 3480
aacggtctgg tctttgccga ccgcacgccg catccagcgc tgacggaagc aaaacaccag 3540
cagcagtttt tccagttccg tttatccggg caaaccatcg aagtgaccag cgaatacctg 3600
ttccgtcata gcgataacga gctcctgcac tggatggtgg cgctggatgg taagccgctg 3660
gcaagcggtg aagtgcctct ggatgtcgct ccacaaggta aacagttgat tgaactgcct 3720
gaactaccgc agccggagag cgccgggcaa ctctggctca cagtacgcgt agtgcaaccg 3780
aacgcgaccg catggtcaga agccgggcac atcagcgcct ggcagcagtg gcgtctggcg 3840
gaaaacctca gtgtgacgct ccccgccgcg tcccacgcca tcccgcatct gaccaccagc 3900
gaaatggatt tttgcatcga gctgggtaat aagcgttggc aatttaaccg ccagtcaggc 3960
tttctttcac agatgtggat tggcgataaa aaacaactgc tgacgccgct gcgcgatcag 4020
ttcacccgtg caccgctgga taacgacatt ggcgtaagtg aagcgacccg cattgaccct 4080
aacgcctggg tcgaacgctg gaaggcggcg ggccattacc aggccgaagc agcgttgttg 4140
cagtgcacgg cagatacact tgctgatgcg gtgctgatta cgaccgctca cgcgtggcag 4200
catcagggga aaaccttatt tatcagccgg aaaacctacc ggattgatgg tagtggtcaa 4260
atggcgatta ccgttgatgt tgaagtggcg agcgatacac cgcatccggc gcggattggc 4320
ctgaactgcc agctggcgca ggtagcagag cgggtaaact ggctcggatt agggccgcaa 4380
gaaaactatc ccgaccgcct tactgccgcc tgttttgacc gctgggatct gccattgtca 4440
gacatgtata ccccgtacgt cttcccgagc gaaaacggtc tgcgctgcgg gacgcgcgaa 4500
ttgaattatg gcccacacca gtggcgcggc gacttccagt tcaacatcag ccgctacagt 4560
caacagcaac tgatggaaac cagccatcgc catctgctgc acgcggaaga aggcacatgg 4620
ctgaatatcg acggtttcca tatggggatt ggtggcgacg actcctggag cccgtcagta 4680
tcggcggaat tccagctgag cgccggtcgc taccattacc agttggtctg gtgtcaaaaa 4740
taataataac cgggcagggg ggatctttgt gaaggaacct tacttctgtg gtgtgacata 4800
attggacaaa ctacctacag agatttaaag ctctaaggta aatataaaat ttttaagtgt 4860
ataatgtgtt aaactactga ttctaattgt ttgtgtattt tagattccaa cctatggaac 4920
tgatgaatgg gagcagtggt ggaatgccag atccagacat gataagatac attgatgagt 4980
ttggacaaac cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg 5040
ctattgcttt atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca 5100
ttcattttat gtttcaggtt cagggggagg tgtgggaggt tttttaaagc aagtaaaacc 5160
tctacaaatg tggtatggct gattatgatc tgcggccgca gggcctcgtg atacgcctat 5220
ttttataggt taatgtcatg ataataatgg tttcttagac gtcaggtggc acttttcggg 5280
gaaatgtgcg cggaacccct atttgtttat ttttctaaat acattcaaat atgtatccgc 5340
tcatgagaca ataaccctga taaatgcttc aataatattg aaaaaggaag agtatgagta 5400
ttcaacattt ccgtgtcgcc cttattccct tttttgcggc attttgcctt cctgtttttg 5460
ctcacccaga aacgctggtg aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg 5520
gttacatcga actggatctc aacagcggta agatccttga gagttttcgc cccgaagaac 5580
gttttccaat gatgagcact tttaaagttc tgctatgtgg cgcggtatta tcccgtattg 5640
acgccgggca agagcaactc ggtcgccgca tacactattc tcagaatgac ttggttgagt 5700
actcaccagt cacagaaaag catcttacgg atggcatgac agtaagagaa ttatgcagtg 5760
ctgccataac catgagtgat aacactgcgg ccaacttact tctgacaacg atcggaggac 5820
cgaaggagct aaccgctttt ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt 5880
gggaaccgga gctgaatgaa gccataccaa acgacgagcg tgacaccacg atgcctgtag 5940
caatggcaac aacgttgcgc aaactattaa ctggcgaact acttactcta gcttcccggc 6000
aacaattaat agactggatg gaggcggata aagttgcagg accacttctg cgctcggccc 6060
ttccggctgg ctggtttatt gctgataaat ctggagccgg tgagcgtggg tctcgcggta 6120
tcattgcagc actggggcca gatggtaagc cctcccgtat cgtagttatc tacacgacgg 6180
ggagtcaggc aactatggat gaacgaaata gacagatcgc tgagataggt gcctcactga 6240
ttaagcattg gtaactgtca gaccaagttt actcatatat actttagatt gatttaaaac 6300
ttcattttta atttaaaagg atctaggtga agatcctttt tgataatctc atgaccaaaa 6360
tcccttaacg tgagttttcg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat 6420
cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc 6480
taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg 6540
gcttcagcag agcgcagata ccaaatactg tccttctagt gtagccgtag ttaggccacc 6600
acttcaagaa ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg 6660
ctgctgccag tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg 6720
ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa 6780
cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg 6840
aagggagaaa ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga 6900
gggagcttcc agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct 6960
gacttgagcg tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca 7020
gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 7080
ctgcgttatc ccctgattct gtggataacc gtattaccgc ctttgagtga gctgataccg 7140
ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc 7200
caatacgcaa accgcctctc cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca 7260
ggtttcccga ctggaaagcg ggcagtgagc gcaacgcaat taatgtgagt tagctcactc 7320
attaggcacc ccaggcttta cactttatgc ttccggctcg tatgttgtgt ggaattgtga 7380
gcggataaca atttcacaca ggaaacagct atgaccatga ttacgccaag ctggcgcg 7438

<210> 6
<211> 45
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-1

<400> 6
ataagaatgc ggccgcccga tatgacacaa ggggttgtga ccggg 45

<210> 7
<211> 40
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-3

<400> 7
ataagaatgc ggccgcatcc gccgctacgt cttccgtgcc 40

<210> 8
<211> 24
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-8

<400> 8
cccgttggca ggaagcactt ccgg 24

<210> 9
<211> 55
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-9

<400> 9
ggatcctcga gccgcgggcg gccgcctacg ccgctacgtc ttccgtgccg tcctg 55

<210> 10
<211> 5711
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector
pCMV-C31-Int(wt)

<400> 10
aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960
cagggcggcc gcccgatatg acacaagggg ttgtgaccgg ggtggacacg tacgcgggtg 1020
cttacgaccg tcagtcgcgc gagcgcgaga attcgagcgc agcaagccca gcgacacagc 1080
gtagcgccaa cgaagacaag gcggccgacc ttcagcgcga agtcgagcgc gacgggggcc 1140
ggttcaggtt cgtcgggcat ttcagcgaag cgccgggcac gtcggcgttc gggacggcgg 1200
agcgcccgga gttcgaacgc atcctgaacg aatgccgcgc cgggcggctc aacatgatca 1260
ttgtctatga cgtgtcgcgc ttctcgcgcc tgaaggtcat ggacgcgatt ccgattgtct 1320
cggaattgct cgccctgggc gtgacgattg tttccactca ggaaggcgtc ttccggcagg 1380
gaaacgtcat ggacctgatt cacctgatta tgcggctcga cgcgtcgcac aaagaatctt 1440
cgctgaagtc ggcgaagatt ctcgacacga agaaccttca gcgcgaattg ggcgggtacg 1500
tcggcgggaa ggcgccttac ggcttcgagc ttgtttcgga gacgaaggag atcacgcgca 1560
acggccgaat ggtcaatgtc gtcatcaaca agcttgcgca ctcgaccact ccccttaccg 1620
gacccttcga gttcgagccc gacgtaatcc ggtggtggtg gcgtgagatc aagacgcaca 1680
aacaccttcc cttcaagccg ggcagtcaag ccgccattca cccgggcagc atcacggggc 1740
tttgtaagcg catggacgct gacgccgtgc cgacccgggg cgagacgatt gggaagaaga 1800
ccgcttcaag cgcctgggac ccggcaaccg ttatgcgaat ccttcgggac ccgcgtattg 1860
cgggcttcgc cgctgaggtg atctacaaga agaagccgga cggcacgccg accacgaaga 1920
ttgagggtta ccgcattcag cgcgacccga tcacgctccg gccggtcgag cttgattgcg 1980
gaccgatcat cgagcccgct gagtggtatg agcttcaggc gtggttggac ggcagggggc 2040
gcggcaaggg gctttcccgg gggcaagcca ttctgtccgc catggacaag ctgtactgcg 2100
agtgtggcgc cgtcatgact tcgaagcgcg gggaagaatc gatcaaggac tcttaccgct 2160
gccgtcgccg gaaggtggtc gacccgtccg cacctgggca gcacgaaggc acgtgcaacg 2220
tcagcatggc ggcactcgac aagttcgttg cggaacgcat cttcaacaag atcaggcacg 2280
ccgaaggcga cgaagagacg ttggcgcttc tgtgggaagc cgcccgacgc ttcggcaagc 2340
tcactgaggc gcctgagaag agcggcgaac gggcgaacct tgttgcggag cgcgccgacg 2400
ccctgaacgc ccttgaagag ctgtacgaag accgcgcggc aggcgcgtac gacggacccg 2460
ttggcaggaa gcacttccgg aagcaacagg cagcgctgac gctccggcag caaggggcgg 2520
aagagcggct tgccgaactt gaagccgccg aagccccgaa gcttcccctt gaccaatggt 2580
tccccgaaga cgccgacgct gacccgaccg gccctaagtc gtggtggggg cgcgcgtcag 2640
tagacgacaa gcgcgtgttc gtcgggctct tcgtagacaa gatcgttgtc acgaagtcga 2700
ctacgggcag ggggcaggga acgcccatcg agaagcgcgc ttcgatcacg tgggcgaagc 2760
cgccgaccga cgacgacgaa gacgacgccc aggacggcac ggaagacgta gcggcgtagg 2820
cggcgcccgg gctcgagatc caggcgcgga tcaataaaag atcattattt tcaatagatc 2880
tgtgtgttgg ttttttgtgt gccttggggg agggggaggc cagaatgagg cgcggccaag 2940
ggggaggggg aggccagaat gaccttgggg gagggggagg ccagaatgac cttgggggag 3000
ggggaggcca gaatgaggcg cgcccccggg taccgagctc gaattcactg gccgtcgttt 3060
tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc 3120
cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt 3180
tgcgcagcct gaatggcgaa tggcgcctga tgcggtattt tctccttacg catctgtgcg 3240
gtatttcaca ccgcatatgg tgcactctca gtacaatctg ctctgatgcc gcatagttaa 3300
gccagccccg acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 3360
catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag aggttttcac 3420
cgtcatcacc gaaacgcgcg agacgaaagg gcctcgtgat acgcctattt ttataggtta 3480
atgtcatgat aataatggtt tcttagacgt caggtggcac ttttcgggga aatgtgcgcg 3540
gaacccctat ttgtttattt ttctaaatac attcaaatat gtatccgctc atgagacaat 3600
aaccctgata aatgcttcaa taatattgaa aaaggaagag tatgagtatt caacatttcc 3660
gtgtcgccct tattcccttt tttgcggcat tttgccttcc tgtttttgct cacccagaaa 3720
cgctggtgaa agtaaaagat gctgaagatc agttgggtgc acgagtgggt tacatcgaac 3780
tggatctcaa cagcggtaag atccttgaga gttttcgccc cgaagaacgt tttccaatga 3840
tgagcacttt taaagttctg ctatgtggcg cggtattatc ccgtattgac gccgggcaag 3900
agcaactcgg tcgccgcata cactattctc agaatgactt ggttgagtac tcaccagtca 3960
cagaaaagca tcttacggat ggcatgacag taagagaatt atgcagtgct gccataacca 4020
tgagtgataa cactgcggcc aacttacttc tgacaacgat cggaggaccg aaggagctaa 4080
ccgctttttt gcacaacatg ggggatcatg taactcgcct tgatcgttgg gaaccggagc 4140
tgaatgaagc cataccaaac gacgagcgtg acaccacgat gcctgtagca atggcaacaa 4200
cgttgcgcaa actattaact ggcgaactac ttactctagc ttcccggcaa caattaatag 4260
actggatgga ggcggataaa gttgcaggac cacttctgcg ctcggccctt ccggctggct 4320
ggtttattgc tgataaatct ggagccggtg agcgtgggtc tcgcggtatc attgcagcac 4380
tggggccaga tggtaagccc tcccgtatcg tagttatcta cacgacgggg agtcaggcaa 4440
ctatggatga acgaaataga cagatcgctg agataggtgc ctcactgatt aagcattggt 4500
aactgtcaga ccaagtttac tcatatatac tttagattga tttaaaactt catttttaat 4560
ttaaaaggat ctaggtgaag atcctttttg ataatctcat gaccaaaatc ccttaacgtg 4620
agttttcgtt ccactgagcg tcagaccccg tagaaaagat caaaggatct tcttgagatc 4680
ctttttttct gcgcgtaatc tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg 4740
tttgtttgcc ggatcaagag ctaccaactc tttttccgaa ggtaactggc ttcagcagag 4800
cgcagatacc aaatactgtc cttctagtgt agccgtagtt aggccaccac ttcaagaact 4860
ctgtagcacc gcctacatac ctcgctctgc taatcctgtt accagtggct gctgccagtg 4920
gcgataagtc gtgtcttacc gggttggact caagacgata gttaccggat aaggcgcagc 4980
ggtcgggctg aacggggggt tcgtgcacac agcccagctt ggagcgaacg acctacaccg 5040
aactgagata cctacagcgt gagctatgag aaagcgccac gcttcccgaa gggagaaagg 5100
cggacaggta tccggtaagc ggcagggtcg gaacaggaga gcgcacgagg gagcttccag 5160
ggggaaacgc ctggtatctt tatagtcctg tcgggtttcg ccacctctga cttgagcgtc 5220
gatttttgtg atgctcgtca ggggggcgga gcctatggaa aaacgccagc aacgcggcct 5280
ttttacggtt cctggccttt tgctggcctt ttgctcacat gttctttcct gcgttatccc 5340
ctgattctgt ggataaccgt attaccgcct ttgagtgagc tgataccgct cgccgcagcc 5400
gaacgaccga gcgcagcgag tcagtgagcg aggaagcgga agagcgccca atacgcaaac 5460
cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg gcacgacagg tttcccgact 5520
ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta gctcactcat taggcacccc 5580
aggctttaca ctttatgctt ccggctcgta tgttgtgtgg aattgtgagc ggataacaat 5640
ttcacacagg aaacagctat gaccatgatt acgccaagct agcccgggct agcttgcatg 5700
cctgcaggtt t 5711



<210> 11
<211> 69
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer C31-2-2

<400> 11
tagaattccg ctcgagagtc taaaccttcc tcttcttctt aggcgccgct acgtcttccg 60
tgccgtcct 69

<210> 12
<211> 5723
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector
pCMV-C31-Int(CNLS)

<400> 12
cctgcaggtt taaacagtcc gatgtacggg ccagatatac gcgttgacat tgattattga 60
ctagttatta atagtaatca attacggggt cattagttca tagcccatat atggagttcc 120
gcgttacata acttacggta aatggcccgc ctggctgacc gcccaacgac ccccgcccat 180
tgacgtcaat aatgacgtat gttcccatag taacgccaat agggactttc cattgacgtc 240
aatgggtgga ctatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc 300
caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tatgcccagt 360
acatgacctt atgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta 420
ccatggtgat gcggttttgg cagtacatca atgggcgtgg atagcggttt gactcacggg 480
gatttccaag tctccacccc attgacgtca atgggagttt gttttggcac caaaatcaac 540
gggactttcc aaaatgtcgt aacaactccg ccccattgac gcaaatgggc ggtaggcgtg 600
tacggtggga ggtctatata agcagagctc tctggctaac tagagaaccc actgcttact 660
ggcttatcga aattaatacg actcactata gggagaccca agctgactct agacttaatt 720
aagcgttggg gtgagtactc cctctcaaaa gcgggcatga cttctgcgct aagattgtca 780
gtttccaaaa acgaggagga tttgatattc acctggcccg cggtgatgcc tttgagggtg 840
gccgcgtcca tctggtcaga aaagacaatc tttttgttgt caagcttgag gtgtggcagg 900
cttgagatct ggccatacac ttgagtgaca ttgacatcca ctttgccttt ctctccacag 960
gtgtccactc ccagggcggc cgcccgatat gacacaaggg gttgtgaccg gggtggacac 1020
gtacgcgggt gcttacgacc gtcagtcgcg cgagcgcgag aattcgagcg cagcaagccc 1080
agcgacacag cgtagcgcca acgaagacaa ggcggccgac cttcagcgcg aagtcgagcg 1140
cgacgggggc cggttcaggt tcgtcgggca tttcagcgaa gcgccgggca cgtcggcgtt 1200






cgggacggcg gagcgcccgg agttcgaacg catcctgaac gaatgccgcg ccgggcggct 1260
caacatgatc attgtctatg acgtgtcgcg cttctcgcgc ctgaaggtca tggacgcgat 1320
tccgattgtc tcggaattgc tcgccctggg cgtgacgatt gtttccactc aggaaggcgt 1380
cttccggcag ggaaacgtca tggacctgat tcacctgatt atgcggctcg acgcgtcgca 1440
caaagaatct tcgctgaagt cggcgaagat tctcgacacg aagaaccttc agcgcgaatt 1500
gggcgggtac gtcggcggga aggcgcctta cggcttcgag cttgtttcgg agacgaagga 1560
gatcacgcgc aacggccgaa tggtcaatgt cgtcatcaac aagcttgcgc actcgaccac 1620
tccccttacc ggacccttcg agttcgagcc cgacgtaatc cggtggtggt ggcgtgagat 1680
caagacgcac aaacaccttc ccttcaagcc gggcagtcaa gccgccattc acccgggcag 1740
catcacgggg ctttgtaagc gcatggacgc tgacgccgtg ccgacccggg gcgagacgat 1800
tgggaagaag accgcttcaa gcgcctggga cccggcaacc gttatgcgaa tccttcggga 1860
cccgcgtatt gcgggcttcg ccgctgaggt gatctacaag aagaagccgg acggcacgcc 1920
gaccacgaag attgagggtt accgcattca gcgcgacccg atcacgctcc ggccggtcga 1980
gcttgattgc ggaccgatca tcgagcccgc tgagtggtat gagcttcagg cgtggttgga 2040
cggcaggggg cgcggcaagg ggctttcccg ggggcaagcc attctgtccg ccatggacaa 2100
gctgtactgc gagtgtggcg ccgtcatgac ttcgaagcgc ggggaagaat cgatcaagga 2160
ctcttaccgc tgccgtcgcc ggaaggtggt cgacccgtcc gcacctgggc agcacgaagg 2220
cacgtgcaac gtcagcatgg cggcactcga caagttcgtt gcggaacgca tcttcaacaa 2280
gatcaggcac gccgaaggcg acgaagagac gttggcgctt ctgtgggaag ccgcccgacg 2340
cttcggcaag ctcactgagg cgcctgagaa gagcggcgaa cgggcgaacc ttgttgcgga 2400
gcgcgccgac gccctgaacg cccttgaaga gctgtacgaa gaccgcgcgg caggcgcgta 2460
cgacggaccc gttggcagga agcacttccg gaagcaacag gcagcgctga cgctccggca 2520
gcaaggggcg gaagagcggc ttgccgaact tgaagccgcc gaagccccga agcttcccct 2580
tgaccaatgg ttccccgaag acgccgacgc tgacccgacc ggccctaagt cgtggtgggg 2640
gcgcgcgtca gtagacgaca agcgcgtgtt cgtcgggctc ttcgtagaca agatcgttgt 2700
cacgaagtcg actacgggca gggggcaggg aacgcccatc gagaagcgcg cttcgatcac 2760
gtgggcgaag ccgccgaccg acgacgacga agacgacgcc caggacggca cggaagacgt 2820
agcggcgcct aagaagaaga ggaaggttta gactctcgag atccaggcgc ggatcaataa 2880
aagatcatta ttttcaatag atctgtgtgt tggttttttg tgtgccttgg gggaggggga 2940
ggccagaatg aggcgcggcc aagggggagg gggaggccag aatgaccttg ggggaggggg 3000
aggccagaat gaccttgggg gagggggagg ccagaatgag gcgcgccccc gggtaccgag 3060
ctcgaattca ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca 3120
acttaatcgc cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg 3180
caccgatcgc ccttcccaac agttgcgcag gctgaatggc gaatggcgcc tgatgcggta 3240
ttttctcctt acgcatctgt gcggtatttc acaccgcata tggtgcactc tcagtacaat 3300
ctgctctgat gccgcatagt taagccagcc ccgacacccg ccaacacccg ctgacgcgcc 3360
ctgacgggct tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg tctccgggag 3420
ctgcatgtgt cagaggtttt caccgtcatc accgaaacgc gcgagacgaa agggcctcgt 3480
gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga cgtcaggtgg 3540
cacttttcgg ggaaatgtgc gcggaacccc tatttgttta tttttctaaa tacattcaaa 3600
tatgtatccg ctcatgagac aataaccctg ataaatgctt caataatatt gaaaaaggaa 3660
gagtatgagt attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct 3720
tcctgttttt gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg 3780
tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg 3840
ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt 3900
atcccgtatt gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga 3960
cttggttgag tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga 4020
attatgcagt gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac 4080
gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg 4140
ccttgatcgt tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac 4200
gatgcctgta gcaatggcaa caacgttgcg caaactatta actggcgaac tacttactct 4260
agcttcccgg caacaattaa tagactggat ggaggcggat aaagttgcag gaccacttct 4320
gcgctcggcc cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg 4380
gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat 4440
ctacacgacg gggagtcagg caactatgga tgaacgaaat agacagatcg ctgagatagg 4500
tgcctcactg attaagcatt ggtaactgtc agaccaagtt tactcatata tactttagat 4560
tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt ttgataatct 4620
catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 4680
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa 4740
aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc 4800
gaaggtaact ggcttcagca gagcgcagat accaaatact gtccttctag tgtagccgta 4860
gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct 4920
gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 4980
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag 5040
cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc 5100
cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg 5160
agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt 5220
tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 5280
gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca 5340
catgttcttt cctgcgttat cccctgattc tgtggataac cgtattaccg cctttgagtg 5400
agctgatacc gctcgccgca gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc 5460
ggaagagcgc ccaatacgca aaccgcctct ccccgcgcgt tggccgattc attaatgcag 5520
ctggcacgac aggtttcccg actggaaagc gggcagtgag cgcaacgcaa ttaatgtgag 5580
ttagctcact cattaggcac cccaggcttt acactttatg cttccggctc gtatgttgtg 5640
tggaattgtg agcggataac aatttcacac aggaaacagc tatgaccatg attacgccaa 5700
gctagcccgg gctagcttgc atg 5723



<210> 13
<211> 4960
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector pCMV-Cre

<400> 13

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 

25 atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 

30 ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 

35 cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 
cagggcggcc tcgaccatgc ccaagaagaa gaggaaggtg tccaatttac tgaccgtaca 1020
ccaaaatttg cctgcattac cggtcgatgc aacgagtgat gaggttcgca agaacctgat 1080 

40 ggacatgttc agggatcgcc aggcgttttc tgagcatacc tggaaaatgc ttctgtccgt 1140 
ttgccggtcg tgggcggcat ggtgcaagtt gaataaccgg aaatggtttc ccgcagaacc 1200 
tgaagatgtt cgcgattatc ttctatatct tcaggcgcgc ggtctggcag taaaaactat 1260 
ccagcaacat ttgggccagc taaacatgct tcatcgtcgg tccgggctgc cacgaccaag 1320 
tgacagcaat gctgtttcac tggttatgcg gcggatccga aaagaaaacg ttgatgccgg 1380 

45 tgaacgtgca aaacaggctc tagcgttcga acgcactgat ttcgaccagg ttcgttcact 1440 
catggaaaat agcgatcgct gccaggatat acgtaatctg gcatttctgg ggattgctta 1500 
taacaccctg ttacgtatag ccgaaattgc caggatcagg gttaaagata tctcacgtac 1560 
tgacggtggg agaatgttaa tccatattgg cagaacgaaa acgctggtta gcaccgcagg 1620 
tgtagagaag gcacttagcc tgggggtaac taaactggtc gagcgatgga tttccgtctc 1680 

50 tggtgtagct gatgatccga ataactacct gttttgccgg gtcagaaaaa atggtgttgc 1740 
cgcgccatct gccaccagcc agctatcaac tcgcgccctg gaagggattt ttgaagcaac 1800 
tcatcgattg atttacggcg ctaaggatga ctctggtcag agatacctgg cctggtctgg 1860
acacagtgcc cgtgtcggag ccgcgcgaga tatggcccgc gctggagttt caataccgga 1920 
gatcatgcaa gctggtggct ggaccaatgt aaatattgtc atgaactata tccgtaacct 1980 

55 ggatagtgaa acaggggcaa tggtgcgcct gctggaagat ggcgattagc cattaacgcg 2040 
taaatgattg cagatccact agttctaggg ccgcgtcgac ctcgagatcc aggcgcggat 2100 
caataaaaga tcattatttt caatagatct gtgtgttggt tttttgtgtg ccttggggga 2160 
gggggaggcc agaatgaggc gcggccaagg gggaggggga ggccagaatg accttggggg 2220 
agggggaggc cagaatgacc ttgggggagg gggaggccag aatgaggcgc gcccccgggt 2280 

60 accgagctcg aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 2340 
tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 2400 
ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat ggcgcctgat 2460
gcggtatttt ctccttacgc atctgtgcgg tatttcacac cgcatatggt gcactctcag 2520 
tacaatctgc tctgatgccg catagttaag ccagccccga cacccgccaa cacccgctga 2580 

65 cgcgccctga cgggcttgtc tgctcccggc atccgcttac agacaagctg tgaccgtctc 2640 
cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg 2700 
cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt cttagacgtc 2760






aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca 2820
ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa 2880
aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 2940
ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca 3000
gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag 3060
ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc 3120
ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca 3180
gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 3240
aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct 3300
gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt 3360
aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga 3420
caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact 3480
tactctagct tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 3540
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga 3600
gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt 3660
agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga 3720
gataggtgcc tcactgatta agcattggta actgtcagac caagtttact catatatact 3780
ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 3840
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt 3900
agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 3960
aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 4020
ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc ttctagtgta 4080
gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 4140
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 4200
aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 4260
gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 4320
aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 4380
aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 4440
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 4500
cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt 4560
tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt 4620
tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga 4680
ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 4740
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa 4800
tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat 4860
gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta 4920
cgccaagcta gcccgggcta gcttgcatgc ctgcaggttt 4960



<210> 14
<211> 3858
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector pRK50

<400> 14

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 

50 tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 

55 tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 

60 attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 

65 cagggcggcc gcgtcgacct cgagatccag gcgcggatca ataaaagatc attattttca 1020 
atagatctgt gtgttggttt tttgtgtgcc ttgggggagg gggaggccag aatgaggcgc 1080 
ggccaagggg gagggggagg ccagaatgac cttgggggag ggggaggcca gaatgacctt 1140 




gggggagggg gaggccagaa tgaggcgcgc ccccgggtac cgagctcgaa ttcactggcc 1200 

gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa tcgccttgca 1260

gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc 1320 

caacagttgc gcagcctgaa tggcgaatgg cgcctgatgc ggtattttct ccttacgcat 1380 

ctgtgcggta tttcacaccg catatggtgc actctcagta caatctgctc tgatgccgca 1440

tagttaagcc agccccgaca cccgccaaca cccgctgacg cgccctgacg ggcttgtctg 1500 

ctcccggcat ccgcttacag acaagctgtg accgtctccg ggagctgcat gtgtcagagg 1560

ttttcaccgt catcaccgaa acgcgcgaga cgaaagggcc tcgtgatacg cctattttta 1620 

taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt tcggggaaat 1680 

gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta tccgctcatg 1740

agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat gagtattcaa 1800 

catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt ttttgctcac 1860

ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg agtgggttac 1920 

atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga agaacgtttt 1980 

15 ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg tattgacgcc 2040 

gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt tgagtactca 2100 

ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg cagtgctgcc 2160 

ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg aggaccgaag 2220 

gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga tcgttgggaa 2280

ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc tgtagcaatg 2340

gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc ccggcaacaa 2400 

ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc ggcccttccg 2460

gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg cggtatcatt 2520 

gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac gacggggagt 2580 

25 caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc actgattaag 2640 

cattggtaac tgtcagacca agtttactca tatatacttt agattgattt aaaacttcat 2700 

ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac caaaatccct 2760

taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa aggatcttct 2820 

tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca 2880 

30 gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt aactggcttc 2940 

agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg ccaccacttc 3000 

aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc agtggctgct 3060 

gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt accggataag 3120 

gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga gcgaacgacc 3180 

35 tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct tcccgaaggg 3240 

agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg cacgagggag 3300 

cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca cctctgactt 3360 

gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac 3420 

gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt ctttcctgcg 3480 

40 ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga taccgctcgc 3540 

cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga gcgcccaata 3600 

cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca cgacaggttt 3660

cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct cactcattag 3720 

gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat tgtgagcgga 3780 

45 taacaatttc acacaggaaa cagctatgac catgattacg ccaagctagc ccgggctagc 3840 

ttgcatgcct gcaggttt 3858 



<210> 15
<211> 6257
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector pRK64 (deltaCre)

<400> 15

cgtcatcacc gaaacgcgcg aggcagctgt ggaatgtgtg tcagttaggg tgtggaaagt 60 

60 ccccaggctc cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca 120 

ggctccccag caggcagaag tgtgcaaagc atgcatctca attagtcagc aaccatagtc 180 

ccgcccctaa ctccgcccat cccgccccta actccgccca gttccgccca ttctccgccc 240 
catggctgac taattttttt tatttatgca gaggccgagg ccgcctcggc ctaggaacag 300 

tcgacgacac tgcagagacc tacttcacta acaaccggta cagttcgtgg accagatggg 360 

65 tgaggtggag tacgcgcccg gggagcccaa gggcacgccc tggcacccgc accgcggctt 420 

cgagaccgtc acgaatagat ccataacttc gtatagcata cattatacga agttataccg 480 

ggccaccatg gtcgcgagta gcttggcact ggccgtcgtt ttacaacgtc gtgactggga 540






aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg ccagctggcg 600
taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga 660
atggcgcttt gcctggtttc cggcaccaga agcggtgccg gaaagctggc tggagtgcga 720
tcttcctgag gccgatactg tcgtcgtccc ctcaaactgg cagatgcacg gttacgatgc 780
gcccatctac accaacgtaa cctatcccat tacggtcaat ccgccgtttg ttcccacgga 840
gaatccgacg ggttgttact cgctcacatt taatgttgat gaaagctggc tacaggaagg 900
ccagacgcga attatttttg atggcgttaa ctcggcgttt catctgtggt gcaacgggcg 960
ctgggtcggt tacggccagg acagtcgttt gccgtctgaa tttgacctga gcgcattttt 1020
acgcgccgga gaaaaccgcc tcgcggtgat ggtgctgcgt tggagtgacg gcagttatct 1080
ggaagatcag gatatgtggc ggatgagcgg cattttccgt gacgtctcgt tgctgcataa 1140
accgactaca caaatcagcg atttccatgt tgccactcgc tttaatgatg atttcagccg 1200
cgctgtactg gaggctgaag ttcagatgtg cggcgagttg cgtgactacc tacgggtaac 1260
agtttcttta tggcagggtg aaacgcaggt cgccagcggc accgcgcctt tcggcggtga 1320
aattatcgat gagcgtggtg gttatgccga tcgcgtcaca ctacgtctga acgtcgaaaa 1380
cccgaaactg tggagcgccg aaatcccgaa tctctatcgt gcggtggttg aactgcacac 1440
cgccgacggc acgctgattg aagcagaagc ctgcgatgtc ggtttccgcg aggtgcggat 1500
tgaaaatggt ctgctgctgc tgaacggcaa gccgttgctg attcgaggcg ttaaccgtca 1560
cgagcatcat cctctgcatg gtcaggtcat ggatgagcag acgatggtgc aggatatcct 1620
gctgatgaag cagaacaact ttaacgccgt gcgctgttcg cattatccga accatccgct 1680
gtggtacacg ctgtgcgacc gctacggcct gtatgtggtg gatgaagcca atattgaaac 1740
ccacggcatg gtgccaatga atcgtctgac cgatgatccg cgctggctac cggcgatgag 1800
cgaacgcgta acgcgaatgg tgcagcgcga tcgtaatcac ccgagtgtga tcatctggtc 1860
gctggggaat gaatcaggcc acggcgctaa tcacgacgcg ctgtatcgct ggatcaaatc 1920
tgtcgatcct tcccgcccgg tgcagtatga aggcggcgga gccgacacca cggccaccga 1980
tattatttgc ccgatgtacg cgcgcgtgga tgaagaccag cccttcccgg ctgtgccgaa 2040
atggtccatc aaaaaatggc tttcgctacc tggagagacg cgcccgctga tcctttgcga 2100
atacgcccac gcgatgggta acagtcttgg cggtttcgct aaatactggc aggcgtttcg 2160
tcagtatccc cgtttacagg gcggcttcgt ctgggactgg gtggatcagt cgctgattaa 2220
atatgatgaa aacggcaacc cgtggtcggc ttacggcggt gattttggcg atacgccgaa 2280
cgatcgccag ttctgtatga acggtctggt ctttgccgac cgcacgccgc atccagcgct 2340
gacggaagca aaacaccagc agcagttttt ccagttccgt ttatccgggc aaaccatcga 2400
agtgaccagc gaatacctgt tccgtcatag cgataacgag ctcctgcact ggatggtggc 2460
gctggatggt aagccgctgg caagcggtga agtgcctctg gatgtcgctc cacaaggtaa 2520
acagttgatt gaactgcctg aactaccgca gccggagagc gccgggcaac tctggctcac 2580
agtacgcgta gtgcaaccga acgcgaccgc atggtcagaa gccgggcaca tcagcgcctg 2640
gcagcagtgg cgtctggcgg aaaacctcag tgtgacgctc cccgccgcgt cccacgccat 2700
cccgcatctg accaccagcg aaatggattt ttgcatcgag ctgggtaata agcgttggca 2760
atttaaccgc cagtcaggct ttctttcaca gatgtggatt ggcgataaaa aacaactgct 2820
gacgccgctg cgcgatcagt tcacccgtgc accgctggat aacgacattg gcgtaagtga 2880
agcgacccgc attgacccta acgcctgggt cgaacgctgg aaggcggcgg gccattacca 2940
ggccgaagca gcgttgttgc agtgcacggc agatacactt gctgatgcgg tgctgattac 3000
gaccgctcac gcgtggcagc atcaggggaa aaccttattt atcagccgga aaacctaccg 3060
gattgatggt agtggtcaaa tggcgattac cgttgatgtt gaagtggcga gcgatacacc 3120
gcatccggcg cggattggcc tgaactgcca gctggcgcag gtagcagagc gggtaaactg 3180
gctcggatta gggccgcaag aaaactatcc cgaccgcctt actgccgcct gttttgaccg 3240
ctgggatctg ccattgtcag acatgtatac cccgtacgtc ttcccgagcg aaaacggtct 3300
gcgctgcggg acgcgcgaat tgaattatgg cccacaccag tggcgcggcg acttccagtt 3360
caacatcagc cgctacagtc aacagcaact gatggaaacc agccatcgcc atctgctgca 3420
cgcggaagaa ggcacatggc tgaatatcga cggtttccat atggggattg gtggcgacga 3480
ctcctggagc ccgtcagtat cggcggaatt ccagctgagc gccggtcgct accattacca 3540
gttggtctgg tgtcaaaaat aataataacc gggcaggggg gatctttgtg aaggaacctt 3600
acttctgtgg tgtgacataa ttggacaaac tacctacaga gatttaaagc tctaaggtaa 3660
atataaaatt tttaagtgta taatgtgtta aactactgat tctaattgtt tgtgtatttt 3720
agattccaac ctatggaact gatgaatggg agcagtggtg gaatgccaga tccagacatg 3780
ataagataca ttgatgagtt tggacaaacc acaactagaa tgcagtgaaa aaaatgcttt 3840
atttgtgaaa tttgtgatgc tattgcttta tttgtaacca ttataagctg caataaacaa 3900
gttaacaaca acaattgcat tcattttatg tttcaggttc agggggaggt gtgggaggtt 3960
ttttaaagca agtaaaacct ctacaaatgt ggtatggctg attatgatct gcggccgcag 4020
ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 4080
tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 4140
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 4200
aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 4260
ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 4320
cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag 4380
agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc 4440
gcggtattat cccgtattga cgccgggcaa gagcaactcg gtcgccgcat acactattct 4500
cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 4560



gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 4620
ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 4680
gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 4740
gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac tggcgaacta 4800
cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga 4860
ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt 4920
gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc 4980
gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct 5040
gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata 5100
ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt 5160
gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc 5220
gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg 5280
caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact 5340
ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg 5400
tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg 5460
ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac 5520
tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca 5580
cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga 5640
gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc 5700
ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct 5760
gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg 5820
agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct 5880
tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc 5940
tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc 6000
gaggaagcgg aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 6060
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 6120
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 6180
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 6240
tacgccaagc tggcgcg 6257

<210> 16
<211> 6252
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: vector pRK64 (deltaInt)

<400> 16
cgtcatcacc gaaacgcgcg aggcagctgt ggaatgtgtg tcagttaggg tgtggaaagt 60
ccccaggctc cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca 120
ggctccccag caggcagaag tgtgcaaagc atgcatctca attagtcagc aaccatagtc 180
ccgcccctaa ctccgcccat cccgccccta actccgccca gttccgccca ttctccgccc 240
catggctgac taattttttt tatttatgca gaggccgagg ccgcctcggc ctaggaacag 300
tcgacgacac tgcagagacc tacttcacta acaaccggta cagttcgtgg accagatggg 360
tgaggtggag tacgcgcccg gggagcccaa aggttacccc agttggggca ctactcccga 420
aaaccgcttc tggatccata acttcgtata gcatacatta tacgaagtta taccgggcca 480
ccatggtcgc gagtagcttg gcactggccg tcgttttaca acgtcgtgac tgggaaaacc 540
ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 600
gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatggc 660
gctttgcctg gtttccggca ccagaagcgg tgccggaaag ctggctggag tgcgatcttc 720
ctgaggccga tactgtcgtc gtcccctcaa actggcagat gcacggttac gatgcgccca 780
tctacaccaa cgtaacctat cccattacgg tcaatccgcc gtttgttccc acggagaatc 840
cgacgggttg ttactcgctc acatttaatg ttgatgaaag ctggctacag gaaggccaga 900
cgcgaattat ttttgatggc gttaactcgg cgtttcatct gtggtgcaac gggcgctggg 960
tcggttacgg ccaggacagt cgtttgccgt ctgaatttga cctgagcgca tttttacgcg 1020
ccggagaaaa ccgcctcgcg gtgatggtgc tgcgttggag tgacggcagt tatctggaag 1080
atcaggatat gtggcggatg agcggcattt tccgtgacgt ctcgttgctg cataaaccga 1140
ctacacaaat cagcgatttc catgttgcca ctcgctttaa tgatgatttc agccgcgctg 1200
tactggaggc tgaagttcag atgtgcggcg agttgcgtga ctacctacgg gtaacagttt 1260
ctttatggca gggtgaaacg caggtcgcca gcggcaccgc gcctttcggc ggtgaaatta 1320
tcgatgagcg tggtggttat gccgatcgcg tcacactacg tctgaacgtc gaaaacccga 1380
aactgtggag cgccgaaatc ccgaatctct atcgtgcggt ggttgaactg cacaccgccg 1440
acggcacgct gattgaagca gaagcctgcg atgtcggttt ccgcgaggtg cggattgaaa 1500
atggtctgct gctgctgaac ggcaagccgt tgctgattcg aggcgttaac cgtcacgagc 1560
atcatcctct gcatggtcag gtcatggatg agcagacgat ggtgcaggat atcctgctga 1620
tgaagcagaa caactttaac gccgtgcgct gttcgcatta tccgaaccat ccgctgtggt 1680
acacgctgtg cgaccgctac ggcctgtatg tggtggatga agccaatatt gaaacccacg 1740
gcatggtgcc aatgaatcgt ctgaccgatg atccgcgctg gctaccggcg atgagcgaac 1800
gcgtaacgcg aatggtgcag cgcgatcgta atcacccgag tgtgatcatc tggtcgctgg 1860
ggaatgaatc aggccacggc gctaatcacg acgcgctgta tcgctggatc aaatctgtcg 1920
atccttcccg cccggtgcag tatgaaggcg gcggagccga caccacggcc accgatatta 1980
tttgcccgat gtacgcgcgc gtggatgaag accagccctt cccggctgtg ccgaaatggt 2040
ccatcaaaaa atggctttcg ctacctggag agacgcgccc gctgatcctt tgcgaatacg 2100
cccacgcgat gggtaacagt cttggcggtt tcgctaaata ctggcaggcg tttcgtcagt 2160
atccccgttt acagggcggc ttcgtctggg actgggtgga tcagtcgctg attaaatatg 2220
atgaaaacgg caacccgtgg tcggcttacg gcggtgattt tggcgatacg ccgaacgatc 2280
gccagttctg tatgaacggt ctggtctttg ccgaccgcac gccgcatcca gcgctgacgg 2340
aagcaaaaca ccagcagcag tttttccagt tccgtttatc cgggcaaacc atcgaagtga 2400
ccagcgaata cctgttccgt catagcgata acgagctcct gcactggatg gtggcgctgg 2460
atggtaagcc gctggcaagc ggtgaagtgc ctctggatgt cgctccacaa ggtaaacagt 2520
tgattgaact gcctgaacta ccgcagccgg agagcgccgg gcaactctgg ctcacagtac 2580
gcgtagtgca accgaacgcg accgcatggt cagaagccgg gcacatcagc gcctggcagc 2640
agtggcgtct ggcggaaaac ctcagtgtga cgctccccgc cgcgtcccac gccatcccgc 2700
atctgaccac cagcgaaatg gatttttgca tcgagctggg taataagcgt tggcaattta 2760
accgccagtc aggctttctt tcacagatgt ggattggcga taaaaaacaa ctgctgacgc 2820
cgctgcgcga tcagttcacc cgtgcaccgc tggataacga cattggcgta agtgaagcga 2880
cccgcattga ccctaacgcc tgggtcgaac gctggaaggc ggcgggccat taccaggccg 2940
aagcagcgtt gttgcagtgc acggcagata cacttgctga tgcggtgctg attacgaccg 3000
ctcacgcgtg gcagcatcag gggaaaacct tatttatcag ccggaaaacc taccggattg 3060
atggtagtgg tcaaatggcg attaccgttg atgttgaagt ggcgagcgat acaccgcatc 3120
cggcgcggat tggcctgaac tgccagctgg cgcaggtagc agagcgggta aactggctcg 3180
gattagggcc gcaagaaaac tatcccgacc gccttactgc cgcctgtttt gaccgctggg 3240
atctgccatt gtcagacatg tataccccgt acgtcttccc gagcgaaaac ggtctgcgct 3300
gcgggacgcg cgaattgaat tatggcccac accagtggcg cggcgacttc cagttcaaca 3360
tcagccgcta cagtcaacag caactgatgg aaaccagcca tcgccatctg ctgcacgcgg 3420
aagaaggcac atggctgaat atcgacggtt tccatatggg gattggtggc gacgactcct 3480
ggagcccgtc agtatcggcg gaattccagc tgagcgccgg tcgctaccat taccagttgg 3540
tctggtgtca aaaataataa taaccgggca ggggggatct ttgtgaagga accttacttc 3600
tgtggtgtga cataattgga caaactacct acagagattt aaagctctaa ggtaaatata 3660
aaatttttaa gtgtataatg tgttaaacta ctgattctaa ttgtttgtgt attttagatt 3720
ccaacctatg gaactgatga atgggagcag tggtggaatg ccagatccag acatgataag 3780
atacattgat gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg 3840
tgaaatttgt gatgctattg ctttatttgt aaccattata agctgcaata aacaagttaa 3900
caacaacaat tgcattcatt ttatgtttca ggttcagggg gaggtgtggg aggtttttta 3960
aagcaagtaa aacctctaca aatgtggtat ggctgattat gatctgcggc cgcagggcct 4020
cgtgatacgc ctatttttat aggttaatgt catgataata atggtttctt agacgtcagg 4080
tggcactttt cggggaaatg tgcgcggaac ccctatttgt ttatttttct aaatacattc 4140
aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat attgaaaaag 4200
gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg cggcattttg 4260
ccttcctgtt tttgctcacc cagaaacgct ggtgaaagta aaagatgctg aagatcagtt 4320
gggtgcacga gtgggttaca tcgaactgga tctcaacagc ggtaagatcc ttgagagttt 4380
tcgccccgaa gaacgttttc caatgatgag cacttttaaa gttctgctat gtggcgcggt 4440
attatcccgt attgacgccg ggcaagagca actcggtcgc cgcatacact attctcagaa 4500
tgacttggtt gagtactcac cagtcacaga aaagcatctt acggatggca tgacagtaag 4560
agaattatgc agtgctgcca taaccatgag tgataacact gcggccaact tacttctgac 4620
aacgatcgga ggaccgaagg agctaaccgc ttttttgcac aacatggggg atcatgtaac 4680
tcgccttgat cgttgggaac cggagctgaa tgaagccata ccaaacgacg agcgtgacac 4740
cacgatgcct gtagcaatgg caacaacgtt gcgcaaacta ttaactggcg aactacttac 4800
tctagcttcc cggcaacaat taatagactg gatggaggcg gataaagttg caggaccact 4860
tctgcgctcg gcccttccgg ctggctggtt tattgctgat aaatctggag ccggtgagcg 4920
tgggtctcgc ggtatcattg cagcactggg gccagatggt aagccctccc gtatcgtagt 4980
tatctacacg acggggagtc aggcaactat ggatgaacga aatagacaga tcgctgagat 5040
aggtgcctca ctgattaagc attggtaact gtcagaccaa gtttactcat atatacttta 5100
gattgattta aaacttcatt tttaatttaa aaggatctag gtgaagatcc tttttgataa 5160
tctcatgacc aaaatccctt aacgtgagtt ttcgttccac tgagcgtcag accccgtaga 5220
aaagatcaaa ggatcttctt gagatccttt ttttctgcgc gtaatctgct gcttgcaaac 5280
aaaaaaacca ccgctaccag cggtggtttg tttgccggat caagagctac caactctttt 5340
tccgaaggta actggcttca gcagagcgca gataccaaat actgtccttc tagtgtagcc 5400
gtagttaggc caccacttca agaactctgt agcaccgcct acatacctcg ctctgctaat 5460
cctgttacca gtggctgctg ccagtggcga taagtcgtgt cttaccgggt tggactcaag 5520
acgatagtta ccggataagg cgcagcggtc gggctgaacg gggggttcgt gcacacagcc 5580
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cagcttggag cgaacgacct acaccgaact gagataccta cagcgtgagc tatgagaaag 5640 

cgccacgctt cccgaaggga gaaaggcgga caggtatccg gtaagcggca gggtcggaac 5700 

aggagagcgc acgagggagc ttccaggggg aaacgcctgg tatctttata gtcctgtcgg 5760 

gtttcgccac ctctgacttg agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct 5820 

5 atggaaaaac gccagcaacg cggccttttt acggttcctg gccttttgct ggccttttgc 5880 

tcacatgttc tttcctgcgt tatcccctga ttctgtggat aaccgtatta ccgcctttga 5940

gtgagctgat accgctcgcc gcagccgaac gaccgagcgc agcgagtcag tgagcgagga 6000 

agcggaagag cgcccaatac gcaaaccgcc tctccccgcg cgttggccga ttcattaatg 6060 

cagctggcac gacaggtttc ccgactggaa agcgggcagt gagcgcaacg caattaatgt 6120 

10 gagttagctc actcattagg caccccaggc tttacacttt atgcttccgg ctcgtatgtt 6180 

gtgtggaatt gtgagcggat aacaatttca cacaggaaac agctatgacc atgattacgc 6240 

caagctggcg eg 6252 



<210> 17
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer F64-1

<400> 17

tcagcaacca ggctccccag caggc 25 



<210> 18
<211> 27
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: primer 64-4

<400> 18

gacgacagta tcggcctcag gaagatc 27 



<210> 19
<211> 840
<212> DNA
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: oligonucleotide 80d

<400> 19

ggtaccgagc tcggatcctc tagtaacggc cgccagtgtg ctggaattcg gcttcagcaa 60
ccaggctccc cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccagg 120
tgtggaaagt ccccaggctc cccagcaggc agaagtatgc aaagcatgca tctcaattag 180
tcagcaacca tagtcccgcc cctaactccg cccatcccgc ccctaactcc gcccagttcc 240
gcccattctc cgccccatgg ctgactaatt ttttttattt atgcagaggc cgaggccgcc 300
tcggcctagg aacagtcgac gacactgcag agacctactt cactaacaac cggtacagtt 360
cgtggaccag atgggtgagg tggagtacgc gcccggggag cccaaaggtt accccagttg 420
gggcactact cccgaaaacc gcttctggat ccataacttc gtatagcata cattatacga 480
agttataccg ggccaccatg gtcgcgagta gcttggcact ggggttgctt ttgcgnygtc 540
gtgactggga aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg 600
ccagctggcg taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagct 660
gaatggcgaa tggcgctttg cctggcttcc ggcaccagaa gcggtgccgg aaagctggct 720
ggagtgcgat cttcctgagg ccgatactgt cgtcaagccg aattctgcag atatccatca 780
cactggcggc cgctcgagca tgcatctaga gggccaattc gccctatagt gagtcgtatt 840



<210> 20
<211> 1842
<212> DNA
<213> Bacteriophage phi-C31

<220>
<221> CDS
<222> (1)..(1839)

<400> 20

atg aca caa ggg gtt gtg acc ggg gtg gac acg tac gcg ggt gct tac 48
Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr
1 5 10 15

gac cgt cag tcg cgc gag cgc gag aat tcg agc gca gca agc cca gcg 96
Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala
20 25 30

aca cag cgt agc gcc aac gaa gac aag gcg gcc gac ctt cag cgc gaa 144
Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu
35 40 45

gtc gag cgc gac ggg ggc cgg ttc agg ttc gtc ggg cat ttc agc gaa 192
Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu
50 55 60

gcg ccg ggc acg tcg gcg ttc ggg acg gcg gag cgc ccg gag ttc gaa 240
Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu
65 70 75 80

cgc atc ctg aac gaa tgc cgc gcc ggg cgg ctc aac atg atc att gtc 288
Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val
85 90 95

tat gac gtg tcg cgc ttc tcg cgc ctg aag gtc atg gac gcg att ccg 336
Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro
100 105 110

att gtc tcg gaa ttg ctc gcc ctg ggc gtg acg att gtt tcc act cag 384
Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln
115 120 125

gaa ggc gtc ttc cgg cag gga aac gtc atg gac ctg att cac ctg att 432
Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile
130 135 140

atg cgg ctc gac gcg tcg cac aaa gaa tct tcg ctg aag tcg gcg aag 480
Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys
145 150 155 160

att ctc gac acg aag aac ctt cag cgc gaa ttg ggc ggg tac gtc ggc 528
Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly
165 170 175

ggg aag gcg cct tac ggc ttc gag ctt gtt tcg gag acg aag gag atc 576
Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile
180 185 190

acg cgc aac ggc cga atg gtc aat gtc gtc atc aac aag ctt gcg cac 624
Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His
195 200 205



tcg acc act ccc ctt acc gga ccc ttc gag ttc gag ccc gac gta atc 672
Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile
210 215 220

cgg tgg tgg tgg cgt gag atc aag acg cac aaa cac ctt ccc ttc aag 720
Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys
225 230 235 240

ccg ggc agt caa gcc gcc att cac ccg ggc agc atc acg ggg ctt tgt 768
Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys
245 250 255

aag cgc atg gac gct gac gcc gtg ccg acc cgg ggc gag acg att ggg 816
Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly
260 265 270

aag aag acc gct tca agc gcc tgg gac ccg gca acc gtt atg cga atc 864
Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile
275 280 285

ctt cgg gac ccg cgt att gcg ggc ttc gcc gct gag gtg atc tac aag 912
Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys
290 295 300

aag aag ccg gac ggc acg ccg acc acg aag att gag ggt tac cgc att 960
Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile
305 310 315 320

cag cgc gac ccg atc acg ctc cgg ccg gtc gag ctt gat tgc gga ccg 1008
Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro
325 330 335

atc atc gag ccc gct gag tgg tat gag ctt cag gcg tgg ttg gac ggc 1056
Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly
340 345 350

agg ggg cgc ggc aag ggg ctt tcc cgg ggg caa gcc att ctg tcc gcc 1104
Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala
355 360 365

atg gac aag ctg tac tgc gag tgt ggc gcc gtc atg act tcg aag cgc 1152
Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg
370 375 380

ggg gaa gaa tcg atc aag gac tct tac cgc tgc cgt cgc cgg aag gtg 1200
Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val
385 390 395 400

gtc gac ccg tcc gca cct ggg cag cac gaa ggc acg tgc aac gtc agc 1248
Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser
405 410 415

atg gcg gca ctc gac aag ttc gtt gcg gaa cgc atc ttc aac aag atc 1296
Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile
420 425 430

agg cac gcc gaa ggc gac gaa gag acg ttg gcg ctt ctg tgg gaa gcc 1344
Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala
435 440 445

gcc cga cgc ttc ggc aag ctc act gag gcg cct gag aag agc ggc gaa 1392
Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu
450 455 460

cgg gcg aac ctt gtt gcg gag cgc gcc gac gcc ctg aac gcc ctt gaa 1440
Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu
465 470 475 480

gag ctg tac gaa gac cgc gcg gca ggc gcg tac gac gga ccc gtt ggc 1488
Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly
485 490 495

agg aag cac ttc cgg aag caa cag gca gcg ctg acg ctc cgg cag caa 1536
Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln
500 505 510

ggg gcg gaa gag cgg ctt gcc gaa ctt gaa gcc gcc gaa gcc ccg aag 1584
Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys
515 520 525

ctt ccc ctt gac caa tgg ttc ccc gaa gac gcc gac gct gac ccg acc 1632
Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr
530 535 540

ggc cct aag tcg tgg tgg ggg cgc gcg tca gta gac gac aag cgc gtg 1680
Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val
545 550 555 560

ttc gtc ggg ctc ttc gta gac aag atc gtt gtc acg aag tcg act acg 1728
Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr
565 570 575

ggc agg ggg cag gga acg ccc atc gag aag cgc gct tcg atc acg tgg 1776
Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp
580 585 590

gcg aag ccg ccg acc gac gac gac gaa gac gac gcc cag gac ggc acg 1824
Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr
595 600 605

gaa gac gta gcg gcg tag 1842 
Glu Asp Val Ala Ala 
610 
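
The feature table for SEQ ID NO 20 gives a total length of 1842 and a CDS spanning (1)..(1839), while SEQ ID NO 21 lists 613 residues. The following is a minimal Python sketch of that arithmetic, assuming 1-based inclusive coordinates and that the nucleotides after the CDS are the stop codon (the listing ends in `tag`). The function name `cds_summary` is hypothetical and not part of the listing.

```python
def cds_summary(total_len: int, cds_start: int, cds_end: int) -> dict:
    """Summarise a CDS feature given 1-based, inclusive coordinates."""
    coding_nt = cds_end - cds_start + 1
    if coding_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return {
        "codons": coding_nt // 3,            # residues in the translation
        "trailing_nt": total_len - cds_end,  # nucleotides after the CDS
    }


# SEQ ID NO 20: <211> 1842, <222> (1)..(1839)
print(cds_summary(1842, 1, 1839))
# SEQ ID NO 22: <211> 1863, <222> (1)..(1860)
print(cds_summary(1863, 1, 1860))
```

The same check applied to SEQ ID NO 54 (length 1410, CDS (1)..(1407)) gives 469 codons, matching the length stated for SEQ ID NO 55.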

<210> 21 
<211> 613 
<212> PRT 

<213> Bacteriophage phi-C31 
<400> 21 

Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr
1 5 10 15

Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala
20 25 30

Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu
35 40 45

Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu
50 55 60

Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu
65 70 75 80

Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val
85 90 95

Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro
100 105 110

Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln
115 120 125

Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile
130 135 140

Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys
145 150 155 160

Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly
165 170 175

Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile
180 185 190

Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His
195 200 205

Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile
210 215 220

Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys
225 230 235 240

Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys
245 250 255

Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly
260 265 270

Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile
275 280 285

Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys
290 295 300

Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile
305 310 315 320

Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro
325 330 335

Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly
340 345 350

Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala
355 360 365

Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg
370 375 380

Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val
385 390 395 400

Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser
405 410 415

Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile
420 425 430

Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala
435 440 445

Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu
450 455 460

Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu
465 470 475 480

Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly
485 490 495

Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln
500 505 510

Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys
515 520 525

Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr
530 535 540

Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val
545 550 555 560

Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr
565 570 575

Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp
580 585 590

Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr
595 600 605

Glu Asp Val Ala Ala 
610 



<210> 22 
<211> 1863 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: DNA sequence 
coding for fusion protein C31-Int (CNLS) 

<220> 

<221> CDS 

<222> (1)..(1860)

<400> 22 

atg aca caa ggg gtt gtg acc ggg gtg gac acg tac gcg ggt gct tac 48
Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr
1 5 10 15

gac cgt cag tcg cgc gag cgc gag aat tcg agc gca gca agc cca gcg 96
Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala
20 25 30

aca cag cgt agc gcc aac gaa gac aag gcg gcc gac ctt cag cgc gaa 144
Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu
35 40 45

gtc gag cgc gac ggg ggc cgg ttc agg ttc gtc ggg cat ttc agc gaa 192
Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu
50 55 60

gcg ccg ggc acg tcg gcg ttc ggg acg gcg gag cgc ccg gag ttc gaa 240
Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu
65 70 75 80

cgc atc ctg aac gaa tgc cgc gcc ggg cgg ctc aac atg atc att gtc 288
Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val
85 90 95

tat gac gtg tcg cgc ttc tcg cgc ctg aag gtc atg gac gcg att ccg 336
Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro
100 105 110

att gtc tcg gaa ttg ctc gcc ctg ggc gtg acg att gtt tcc act cag 384
Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln
115 120 125

gaa ggc gtc ttc cgg cag gga aac gtc atg gac ctg att cac ctg att 432
Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile
130 135 140

atg cgg ctc gac gcg tcg cac aaa gaa tct tcg ctg aag tcg gcg aag 480
Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys
145 150 155 160

att ctc gac acg aag aac ctt cag cgc gaa ttg ggc ggg tac gtc ggc 528
Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly
165 170 175

ggg aag gcg cct tac ggc ttc gag ctt gtt tcg gag acg aag gag atc 576
Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile
180 185 190

acg cgc aac ggc cga atg gtc aat gtc gtc atc aac aag ctt gcg cac 624
Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His
195 200 205

tcg acc act ccc ctt acc gga ccc ttc gag ttc gag ccc gac gta atc 672
Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile
210 215 220

cgg tgg tgg tgg cgt gag atc aag acg cac aaa cac ctt ccc ttc aag 720
Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys
225 230 235 240

ccg ggc agt caa gcc gcc att cac ccg ggc agc atc acg ggg ctt tgt 768
Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys
245 250 255

aag cgc atg gac gct gac gcc gtg ccg acc cgg ggc gag acg att ggg 816
Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly
260 265 270

aag aag acc gct tca agc gcc tgg gac ccg gca acc gtt atg cga atc 864
Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile
275 280 285

ctt cgg gac ccg cgt att gcg ggc ttc gcc gct gag gtg atc tac aag 912
Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys
290 295 300

aag aag ccg gac ggc acg ccg acc acg aag att gag ggt tac cgc att 960
Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile
305 310 315 320

cag cgc gac ccg atc acg ctc cgg ccg gtc gag ctt gat tgc gga ccg 1008
Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro
325 330 335

atc atc gag ccc gct gag tgg tat gag ctt cag gcg tgg ttg gac ggc 1056
Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly
340 345 350

agg ggg cgc ggc aag ggg ctt tcc cgg ggg caa gcc att ctg tcc gcc 1104
Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala
355 360 365

atg gac aag ctg tac tgc gag tgt ggc gcc gtc atg act tcg aag cgc 1152
Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg
370 375 380

ggg gaa gaa tcg atc aag gac tct tac cgc tgc cgt cgc cgg aag gtg 1200
Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val
385 390 395 400

gtc gac ccg tcc gca cct ggg cag cac gaa ggc acg tgc aac gtc agc 1248
Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser
405 410 415

atg gcg gca ctc gac aag ttc gtt gcg gaa cgc atc ttc aac aag atc 1296
Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile
420 425 430

agg cac gcc gaa ggc gac gaa gag acg ttg gcg ctt ctg tgg gaa gcc 1344
Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala
435 440 445

gcc cga cgc ttc ggc aag ctc act gag gcg cct gag aag agc ggc gaa 1392
Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu
450 455 460

cgg gcg aac ctt gtt gcg gag cgc gcc gac gcc ctg aac gcc ctt gaa 1440
Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu
465 470 475 480

gag ctg tac gaa gac cgc gcg gca ggc gcg tac gac gga ccc gtt ggc 1488
Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly
485 490 495

agg aag cac ttc cgg aag caa cag gca gcg ctg acg ctc cgg cag caa 1536
Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln
500 505 510

ggg gcg gaa gag cgg ctt gcc gaa ctt gaa gcc gcc gaa gcc ccg aag 1584
Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys
515 520 525

ctt ccc ctt gac caa tgg ttc ccc gaa gac gcc gac gct gac ccg acc 1632
Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr
530 535 540

ggc cct aag tcg tgg tgg ggg cgc gcg tca gta gac gac aag cgc gtg 1680
Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val
545 550 555 560

ttc gtc ggg ctc ttc gta gac aag atc gtt gtc acg aag tcg act acg 1728
Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr
565 570 575

ggc agg ggg cag gga acg ccc atc gag aag cgc gct tcg atc acg tgg 1776
Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp
580 585 590

gcg aag ccg ccg acc gac gac gac gaa gac gac gcc cag gac ggc acg 1824
Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr
595 600 605

gaa gac gta gcg gcg cct aag aag aag agg aag gtt tag 1863
Glu Asp Val Ala Ala Pro Lys Lys Lys Arg Lys Val
610 615 620

<210> 23 
<211> 620 
<212> PRT 

<213> Artificial Sequence 

<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein C31-Int (CNLS)

<400> 23 

Met Thr Gln Gly Val Val Thr Gly Val Asp Thr Tyr Ala Gly Ala Tyr
1 5 10 15

Asp Arg Gln Ser Arg Glu Arg Glu Asn Ser Ser Ala Ala Ser Pro Ala
20 25 30

Thr Gln Arg Ser Ala Asn Glu Asp Lys Ala Ala Asp Leu Gln Arg Glu
35 40 45

Val Glu Arg Asp Gly Gly Arg Phe Arg Phe Val Gly His Phe Ser Glu
50 55 60

Ala Pro Gly Thr Ser Ala Phe Gly Thr Ala Glu Arg Pro Glu Phe Glu
65 70 75 80

Arg Ile Leu Asn Glu Cys Arg Ala Gly Arg Leu Asn Met Ile Ile Val
85 90 95

Tyr Asp Val Ser Arg Phe Ser Arg Leu Lys Val Met Asp Ala Ile Pro
100 105 110

Ile Val Ser Glu Leu Leu Ala Leu Gly Val Thr Ile Val Ser Thr Gln
115 120 125

Glu Gly Val Phe Arg Gln Gly Asn Val Met Asp Leu Ile His Leu Ile
130 135 140

Met Arg Leu Asp Ala Ser His Lys Glu Ser Ser Leu Lys Ser Ala Lys
145 150 155 160

Ile Leu Asp Thr Lys Asn Leu Gln Arg Glu Leu Gly Gly Tyr Val Gly
165 170 175

Gly Lys Ala Pro Tyr Gly Phe Glu Leu Val Ser Glu Thr Lys Glu Ile
180 185 190

Thr Arg Asn Gly Arg Met Val Asn Val Val Ile Asn Lys Leu Ala His
195 200 205

Ser Thr Thr Pro Leu Thr Gly Pro Phe Glu Phe Glu Pro Asp Val Ile
210 215 220

Arg Trp Trp Trp Arg Glu Ile Lys Thr His Lys His Leu Pro Phe Lys
225 230 235 240

Pro Gly Ser Gln Ala Ala Ile His Pro Gly Ser Ile Thr Gly Leu Cys
245 250 255

Lys Arg Met Asp Ala Asp Ala Val Pro Thr Arg Gly Glu Thr Ile Gly
260 265 270

Lys Lys Thr Ala Ser Ser Ala Trp Asp Pro Ala Thr Val Met Arg Ile
275 280 285

Leu Arg Asp Pro Arg Ile Ala Gly Phe Ala Ala Glu Val Ile Tyr Lys
290 295 300

Lys Lys Pro Asp Gly Thr Pro Thr Thr Lys Ile Glu Gly Tyr Arg Ile
305 310 315 320

Gln Arg Asp Pro Ile Thr Leu Arg Pro Val Glu Leu Asp Cys Gly Pro
325 330 335

Ile Ile Glu Pro Ala Glu Trp Tyr Glu Leu Gln Ala Trp Leu Asp Gly
340 345 350

Arg Gly Arg Gly Lys Gly Leu Ser Arg Gly Gln Ala Ile Leu Ser Ala
355 360 365

Met Asp Lys Leu Tyr Cys Glu Cys Gly Ala Val Met Thr Ser Lys Arg
370 375 380

Gly Glu Glu Ser Ile Lys Asp Ser Tyr Arg Cys Arg Arg Arg Lys Val
385 390 395 400

Val Asp Pro Ser Ala Pro Gly Gln His Glu Gly Thr Cys Asn Val Ser
405 410 415

Met Ala Ala Leu Asp Lys Phe Val Ala Glu Arg Ile Phe Asn Lys Ile
420 425 430

Arg His Ala Glu Gly Asp Glu Glu Thr Leu Ala Leu Leu Trp Glu Ala
435 440 445

Ala Arg Arg Phe Gly Lys Leu Thr Glu Ala Pro Glu Lys Ser Gly Glu
450 455 460

Arg Ala Asn Leu Val Ala Glu Arg Ala Asp Ala Leu Asn Ala Leu Glu
465 470 475 480

Glu Leu Tyr Glu Asp Arg Ala Ala Gly Ala Tyr Asp Gly Pro Val Gly
485 490 495

Arg Lys His Phe Arg Lys Gln Gln Ala Ala Leu Thr Leu Arg Gln Gln
500 505 510

Gly Ala Glu Glu Arg Leu Ala Glu Leu Glu Ala Ala Glu Ala Pro Lys
515 520 525

Leu Pro Leu Asp Gln Trp Phe Pro Glu Asp Ala Asp Ala Asp Pro Thr
530 535 540

Gly Pro Lys Ser Trp Trp Gly Arg Ala Ser Val Asp Asp Lys Arg Val
545 550 555 560

Phe Val Gly Leu Phe Val Asp Lys Ile Val Val Thr Lys Ser Thr Thr
565 570 575

Gly Arg Gly Gln Gly Thr Pro Ile Glu Lys Arg Ala Ser Ile Thr Trp
580 585 590

Ala Lys Pro Pro Thr Asp Asp Asp Glu Asp Asp Ala Gln Asp Gly Thr
595 600 605

Glu Asp Val Ala Ala Pro Lys Lys Lys Arg Lys Val 
610 615 620 



<210> 24 
<211> 43 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 24 

Met Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Lys Cys Arg Leu Lys 
1 5 10 15

Lys Leu Lys Cys Ser Lys Glu Lys Pro Lys Cys Ala Lys Cys Leu Lys
20 25 30

Lys Lys Lys Lys Arg Arg Arg Lys Thr Lys Arg
35 40

<210> 25
<211> 10
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 25
Ile Lys Tyr Phe Lys Lys Phe Pro Lys Asp
1 5 10

<210> 26
<211> 14
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 26
Met Thr Gly Ser Lys Thr Arg Lys His Arg Gly Ser Gly Ala
1 5 10

<210> 27
<211> 14
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 27
Met Thr Gly Ser Lys His Arg Lys His Pro Gly Ser Gly Ala
1 5 10



<210> 28
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 28
Gly Lys Lys Arg Ser Lys Ala
1 5

<210> 29
<211> 14
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 29
Pro Lys Lys Ala Arg Glu Asp Val Ser Arg Lys Arg Pro Arg
1 5 10

<210> 30
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 30
Ala Pro Lys Arg Lys Ser Gly Val Ser Lys Cys
1 5 10

<210> 31
<211> 12
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 31
Glu Glu Asp Gly Pro Gln Lys Lys Lys Arg Arg Leu
1 5 10

<210> 32
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 32
Ala Pro Thr Lys Arg Lys Gly Ser
1 5

40 

<210> 33
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 33
Pro Asn Lys Lys Lys Arg Lys
1 5

<210> 34
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 34
Lys Arg Pro Arg Pro
1 5

<210> 35
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 35
Cys Gly Gly Leu Ser Ser Lys Arg Pro Arg Pro
1 5 10

<210> 36
<211> 19
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 36
Pro Pro Lys Lys Arg Met Arg Arg Arg Ile Glu Pro Lys Lys Lys Lys
1 5 10 15

Lys Arg Pro

<210> 37
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 37
Pro Phe Leu Asp Arg Leu Arg Arg Asp Gln Lys
1 5 10



<210> 38
<211> 9
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 38
Pro Lys Gln Lys Arg Lys Met Ala Arg
1 5

<210> 39
<211> 9
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 39
Ser Val Thr Lys Lys Arg Lys Leu Glu
1 5

<210> 40
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 40
Cys Gly Gly Ala Ala Lys Arg Val Lys Leu Asp
1 5 10

<210> 41
<211> 9
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 41
Pro Ala Ala Lys Arg Val Lys Leu Asp
1 5

<210> 42
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 42
Arg Gln Arg Arg Asn Glu Leu Lys Arg Ser Pro
1 5 10

<210> 43
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 43
Pro Gln Ser Arg Lys Lys Leu Arg
1 5

<210> 44
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 44
Pro Leu Leu Lys Lys Ile Lys Gln
1 5



<210> 45
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 45
Pro Gln Pro Lys Lys Lys Pro
1 5

<210> 46
<211> 9
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 46
Ser Lys Arg Val Ala Lys Arg Lys Leu
1 5

<210> 47
<211> 9
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 47
Ala Ser Lys Ser Arg Lys Arg Lys Leu
1 5

<210> 48
<211> 16
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 48
Gly Gly Leu Cys Ser Ala Arg Leu His Arg His Ala Leu Leu Ala Thr
1 5 10 15

<210> 49
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 49
Arg Lys Thr Lys Lys Lys Ile Lys
1 5

<210> 50
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 50
Arg Lys Leu Lys Lys Leu Gly Asn
1 5

<210> 51
<211> 8
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 51
Arg Lys Asp Arg Arg Gly Gly Arg
1 5

<210> 52
<211> 18
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: NLS

<400> 52
Asp Thr Arg Glu Lys Lys Lys Phe Leu Lys Arg Arg Leu Leu Arg Leu
1 5 10 15

Asp Glu



<210> 53 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: NLS 
<400> 53 

Pro Lys Lys Lys Arg Lys Val 
1 5 
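
The seven residues of SEQ ID NO 53 are the same residues that follow `Glu Asp Val Ala Ala` at the C-terminus of the fusion protein in SEQ ID NOs 22 and 23. The sketch below, offered only as an illustration, translates the final codons of SEQ ID NO 22 and compares the result with SEQ ID NO 53. The names `CODONS`, `translate`, `SEQ22_TAIL` and `SEQ53` are hypothetical, and the codon table is deliberately partial: it covers only the codons that occur in this tail.

```python
# Partial codon table: only the codons found in the 3' end of SEQ ID NO 22.
CODONS = {
    "gaa": "Glu", "gac": "Asp", "gta": "Val", "gcg": "Ala",
    "cct": "Pro", "aag": "Lys", "agg": "Arg", "gtt": "Val",
    "tag": "*",  # stop codon
}


def translate(dna: str) -> list[str]:
    """Translate a spaced or unspaced DNA string up to the first stop codon."""
    dna = dna.replace(" ", "").lower()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODONS[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return residues


# Last line of the SEQ ID NO 22 coding sequence, as listed.
SEQ22_TAIL = "gaa gac gta gcg gcg cct aag aag aag agg aag gtt tag"
# SEQ ID NO 53, as listed.
SEQ53 = "Pro Lys Lys Lys Arg Lys Val".split()

tail = translate(SEQ22_TAIL)
print(tail[-len(SEQ53):] == SEQ53)
```

A full analysis would use a complete genetic-code table; the partial table keeps the sketch short and makes any unexpected codon fail loudly with a `KeyError`.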



<210> 54 
<211> 1410 
<212> DNA 

<213> Bacteriophage R4 

<220> 

<221> CDS 

<222> (1)..(1407)

<400> 54 

atg aat cga ggg ggg ccc act gta cgg gcc gac atc tac gtc cga atc 48
Met Asn Arg Gly Gly Pro Thr Val Arg Ala Asp Ile Tyr Val Arg Ile
1 5 10 15

agc ctg gac cgc aca ggg gaa gag ctc ggg gtc gag cgc cag gag gag 96
Ser Leu Asp Arg Thr Gly Glu Glu Leu Gly Val Glu Arg Gln Glu Glu
20 25 30

tcg tgt cgc gag ctc tgc aag agc ctc ggc atg gag gtg ggg cag gtg 144
Ser Cys Arg Glu Leu Cys Lys Ser Leu Gly Met Glu Val Gly Gln Val
35 40 45

tgg gtc gac aac gac ctg agc gcc acc aag aag aac gtc gtc cgc cct 192
Trp Val Asp Asn Asp Leu Ser Ala Thr Lys Lys Asn Val Val Arg Pro
50 55 60

gac ttc gag gcg atg atc gcg agc aac ccg cag gcg atc gtc tgc tgg 240
Asp Phe Glu Ala Met Ile Ala Ser Asn Pro Gln Ala Ile Val Cys Trp
65 70 75 80

cac acc gac cgg ctc atc cgc gtc acg cgg gac ctg gag cgg gtg atc 288
His Thr Asp Arg Leu Ile Arg Val Thr Arg Asp Leu Glu Arg Val Ile
85 90 95

gac ctc gga gtc aac gtc cac gcc gtg atg gcc gga cac ctg gac ctg 336
Asp Leu Gly Val Asn Val His Ala Val Met Ala Gly His Leu Asp Leu
100 105 110

tcc acc ccg gcc ggc cga gcc gtc gcc cgc acg gtg acg gcc tgg gcc 384
Ser Thr Pro Ala Gly Arg Ala Val Ala Arg Thr Val Thr Ala Trp Ala
115 120 125

acg tac gag ggc gag cag aag gct gag cgc cag aag ctc gcc aac atc 432
Thr Tyr Glu Gly Glu Gln Lys Ala Glu Arg Gln Lys Leu Ala Asn Ile
130 135 140

cag aac gcc cgc gcc ggc aag ccg tac acc ccc ggc atc cgc ccc ttc 480
Gln Asn Ala Arg Ala Gly Lys Pro Tyr Thr Pro Gly Ile Arg Pro Phe
145 150 155 160

ggg tac ggc gac gac cac atg acc atc gtg acg gcc gag gcg gac gcc 528
Gly Tyr Gly Asp Asp His Met Thr Ile Val Thr Ala Glu Ala Asp Ala
165 170 175

atc cgc gac ggc gcg aag atg atc ctc gac ggc tgg tcc ctg tcg gcc 576
Ile Arg Asp Gly Ala Lys Met Ile Leu Asp Gly Trp Ser Leu Ser Ala
180 185 190

gtg gct cgc tac tgg gag gag ctc aag ctc cag tcg ccc cgg agt atg 624
Val Ala Arg Tyr Trp Glu Glu Leu Lys Leu Gln Ser Pro Arg Ser Met
195 200 205

gcc gca ggc ggc aag ggc tgg tct ctg cgg ggc gta aag aag gtg ctg 672
Ala Ala Gly Gly Lys Gly Trp Ser Leu Arg Gly Val Lys Lys Val Leu
210 215 220

acc tcc ccg cgc tac gtc ggg cgg tcc agc tac ctc ggg gag gtc gtg 720
Thr Ser Pro Arg Tyr Val Gly Arg Ser Ser Tyr Leu Gly Glu Val Val
225 230 235 240

ggc gat gct cag tgg ccg ccc atc ctc gac ccg gac gtc tac tac ggg 768
Gly Asp Ala Gln Trp Pro Pro Ile Leu Asp Pro Asp Val Tyr Tyr Gly
245 250 255

gtc gtg gcc atc ctg aac aac ccc gac cgc ttc agc ggg ggc cct cgg 816
Val Val Ala Ile Leu Asn Asn Pro Asp Arg Phe Ser Gly Gly Pro Arg
260 265 270

acc ggc cgc acc ccc ggc acg ctg ctc gca ggc atc gcc ttg tgc ggt 864
Thr Gly Arg Thr Pro Gly Thr Leu Leu Ala Gly Ile Ala Leu Cys Gly
275 280 285

gag tgc ggc aag acg gtc agt gga cgc ggc tac cga ggt gtc ctg gtc 912
Glu Cys Gly Lys Thr Val Ser Gly Arg Gly Tyr Arg Gly Val Leu Val
290 295 300

tac gga tgt aag gac acg cac act cgg acg cct cgg agc atc gct gac 960
Tyr Gly Cys Lys Asp Thr His Thr Arg Thr Pro Arg Ser Ile Ala Asp
305 310 315 320

ggc cgc gcg agc agc tcg acc ctc gcc cgg ctc atg ttc ccc gac ttc 1008
Gly Arg Ala Ser Ser Ser Thr Leu Ala Arg Leu Met Phe Pro Asp Phe
325 330 335

ctg ccc ggc ctc ctg gcc tct ggg cag gcc gag gac ggc cag tcg gca 1056
Leu Pro Gly Leu Leu Ala Ser Gly Gln Ala Glu Asp Gly Gln Ser Ala
340 345 350

gca tcc aag cac tcg gag gcc cag acg ctg cgc gag cgc ctt gac ggg 1104
Ala Ser Lys His Ser Glu Ala Gln Thr Leu Arg Glu Arg Leu Asp Gly
355 360 365

ctg gct acg gcc tac gcg gag ggt gcg atc agc ctg tct cag atg acg 1152
Leu Ala Thr Ala Tyr Ala Glu Gly Ala Ile Ser Leu Ser Gln Met Thr
370 375 380

gcc ggc tcg gaa gca ctg cgg aag aag ctg gag gtg atc gaa gcc gac 1200
Ala Gly Ser Glu Ala Leu Arg Lys Lys Leu Glu Val Ile Glu Ala Asp
385 390 395 400

ctc gtg ggc tcg gca ggc atc ccg ccc ttc gat cca gtg gcc gga gtg 1248
Leu Val Gly Ser Ala Gly Ile Pro Pro Phe Asp Pro Val Ala Gly Val
405 410 415

gct ggc ctg atc tcc ggc tgg ccc acc acg cct ctc ccg acg cgt cga 1296
Ala Gly Leu Ile Ser Gly Trp Pro Thr Thr Pro Leu Pro Thr Arg Arg
420 425 430

gca tgg gtg gac ttc tgc ctg gtg gtc acg ctg aac acc cag aag ggg 1344
Ala Trp Val Asp Phe Cys Leu Val Val Thr Leu Asn Thr Gln Lys Gly
435 440 445

cgc cat gcg tcg agc atg acc gtg gac gac cac gtc acc atc gag tgg 1392
Arg His Ala Ser Ser Met Thr Val Asp Asp His Val Thr Ile Glu Trp
450 455 460

cga gac gtg gcc gag tag 1410
Arg Asp Val Ala Glu
465

<210> 55
<211> 469
<212> PRT
<213> Bacteriophage R4

<400> 55

Met Asn Arg Gly Gly Pro Thr Val Arg Ala Asp Ile Tyr Val Arg Ile
1 5 10 15

Ser Leu Asp Arg Thr Gly Glu Glu Leu Gly Val Glu Arg Gln Glu Glu
20 25 30

Ser Cys Arg Glu Leu Cys Lys Ser Leu Gly Met Glu Val Gly Gln Val
35 40 45

Trp Val Asp Asn Asp Leu Ser Ala Thr Lys Lys Asn Val Val Arg Pro
50 55 60

Asp Phe Glu Ala Met Ile Ala Ser Asn Pro Gln Ala Ile Val Cys Trp
65 70 75 80

His Thr Asp Arg Leu Ile Arg Val Thr Arg Asp Leu Glu Arg Val Ile
85 90 95

Asp Leu Gly Val Asn Val His Ala Val Met Ala Gly His Leu Asp Leu
100 105 110

Ser Thr Pro Ala Gly Arg Ala Val Ala Arg Thr Val Thr Ala Trp Ala
115 120 125

Thr Tyr Glu Gly Glu Gln Lys Ala Glu Arg Gln Lys Leu Ala Asn Ile
130 135 140

Gln Asn Ala Arg Ala Gly Lys Pro Tyr Thr Pro Gly Ile Arg Pro Phe
145 150 155 160

Gly Tyr Gly Asp Asp His Met Thr Ile Val Thr Ala Glu Ala Asp Ala
165 170 175

Ile Arg Asp Gly Ala Lys Met Ile Leu Asp Gly Trp Ser Leu Ser Ala
180 185 190

Val Ala Arg Tyr Trp Glu Glu Leu Lys Leu Gln Ser Pro Arg Ser Met
195 200 205

Ala Ala Gly Gly Lys Gly Trp Ser Leu Arg Gly Val Lys Lys Val Leu
210 215 220

Thr Ser Pro Arg Tyr Val Gly Arg Ser Ser Tyr Leu Gly Glu Val Val
225 230 235 240

Gly Asp Ala Gln Trp Pro Pro Ile Leu Asp Pro Asp Val Tyr Tyr Gly
245 250 255

Val Val Ala Ile Leu Asn Asn Pro Asp Arg Phe Ser Gly Gly Pro Arg
260 265 270

Thr Gly Arg Thr Pro Gly Thr Leu Leu Ala Gly Ile Ala Leu Cys Gly
275 280 285

Glu Cys Gly Lys Thr Val Ser Gly Arg Gly Tyr Arg Gly Val Leu Val
290 295 300

Tyr Gly Cys Lys Asp Thr His Thr Arg Thr Pro Arg Ser Ile Ala Asp
305 310 315 320

Gly Arg Ala Ser Ser Ser Thr Leu Ala Arg Leu Met Phe Pro Asp Phe
325 330 335

Leu Pro Gly Leu Leu Ala Ser Gly Gln Ala Glu Asp Gly Gln Ser Ala
340 345 350

Ala Ser Lys His Ser Glu Ala Gln Thr Leu Arg Glu Arg Leu Asp Gly
355 360 365

Leu Ala Thr Ala Tyr Ala Glu Gly Ala Ile Ser Leu Ser Gln Met Thr
370 375 380

Ala Gly Ser Glu Ala Leu Arg Lys Lys Leu Glu Val Ile Glu Ala Asp
385 390 395 400

Leu Val Gly Ser Ala Gly Ile Pro Pro Phe Asp Pro Val Ala Gly Val
405 410 415

Ala Gly Leu Ile Ser Gly Trp Pro Thr Thr Pro Leu Pro Thr Arg Arg
420 425 430

Ala Trp Val Asp Phe Cys Leu Val Val Thr Leu Asn Thr Gln Lys Gly
435 440 445

Arg His Ala Ser Ser Met Thr Val Asp Asp His Val Thr Ile Glu Trp
450 455 460

Arg Asp Val Ala Glu
465



<210> 56
<211> 1503
<212> DNA
<213> CisA recombinase

<220>
<221> CDS
<222> (1)..(1500)

<400> 56

gtg ata gca ata tat gta agg gta tcg acc gag gaa caa gcg atc aag 48
Val Ile Ala Ile Tyr Val Arg Val Ser Thr Glu Glu Gln Ala Ile Lys
1 5 10 15

gga tcg agc atc gac agc caa atc gag gcc tgt ata aag aaa gca ggg 96
Gly Ser Ser Ile Asp Ser Gln Ile Glu Ala Cys Ile Lys Lys Ala Gly
20 25 30

act aaa gat gtg ctg aag tat gca gat gaa gga ttt tca gga gag ctt 144
Thr Lys Asp Val Leu Lys Tyr Ala Asp Glu Gly Phe Ser Gly Glu Leu
35 40 45

tta gaa cgt ccg gct ttg aat cgc ttg agg gag gat gca agc aag gga 192
Leu Glu Arg Pro Ala Leu Asn Arg Leu Arg Glu Asp Ala Ser Lys Gly
50 55 60

ctt ata agt caa gtc att tgt tac gat cct gac cgt ctt tct cgg aaa 240
Leu Ile Ser Gln Val Ile Cys Tyr Asp Pro Asp Arg Leu Ser Arg Lys
65 70 75 80

tta atg aat cag cta atc att gat gac gaa ttg cga aag cga aac ata 288
Leu Met Asn Gln Leu Ile Ile Asp Asp Glu Leu Arg Lys Arg Asn Ile
85 90 95

cct ttg att ttt gta aat ggt gaa tac gcc aat tct cca gaa ggt caa 336
Pro Leu Ile Phe Val Asn Gly Glu Tyr Ala Asn Ser Pro Glu Gly Gln
100 105 110

ttg ttt ttc gca atg cgc ggg gca atc tca gaa ttt gaa aaa gcc aaa 384
Leu Phe Phe Ala Met Arg Gly Ala Ile Ser Glu Phe Glu Lys Ala Lys
115 120 125

atc aaa gaa cgg aca tca agc ggc cga ctt caa aaa atg aaa aaa ggc 432
Ile Lys Glu Arg Thr Ser Ser Gly Arg Leu Gln Lys Met Lys Lys Gly
130 135 140

atg atc att aaa gat tct aaa cta tat ggc tat aaa ttt gtt aaa gag 480
Met Ile Ile Lys Asp Ser Lys Leu Tyr Gly Tyr Lys Phe Val Lys Glu
145 150 155 160

aaa aga act ctt gag ata tta gaa gag gaa gca aaa atc att cgg atg 528
Lys Arg Thr Leu Glu Ile Leu Glu Glu Glu Ala Lys Ile Ile Arg Met
165 170 175

att ttt aac tat ttc acc gat cat aaa agc cct ttt ttc ggc aga gta 576
Ile Phe Asn Tyr Phe Thr Asp His Lys Ser Pro Phe Phe Gly Arg Val
180 185 190

aat ggt att gct cta cat tta act cag atg ggg gtt aaa aca aaa aaa 624
Asn Gly Ile Ala Leu His Leu Thr Gln Met Gly Val Lys Thr Lys Lys
195 200 205

ggc gcc aaa gta tgg cac agg cag gtt gtt cgg caa ata tta atg aac 672
Gly Ala Lys Val Trp His Arg Gln Val Val Arg Gln Ile Leu Met Asn
210 215 220

tct tcc tat aag ggt gaa cat aga cag tat aaa tat gat aca gag ggt 720
Ser Ser Tyr Lys Gly Glu His Arg Gln Tyr Lys Tyr Asp Thr Glu Gly
225 230 235 240

tcc tat gtt tca aag cag gca ggg aac aaa tct ata att aaa ata agg 768
Ser Tyr Val Ser Lys Gln Ala Gly Asn Lys Ser Ile Ile Lys Ile Arg
245 250 255

cct gaa gaa gaa caa atc act gtg aca att cca gca att gtt cca gct 816
Pro Glu Glu Glu Gln Ile Thr Val Thr Ile Pro Ala Ile Val Pro Ala
260 265 270

gaa caa tgg gat tat gct caa gaa ctc tta ggt caa agt aaa aga aaa 864
Glu Gln Trp Asp Tyr Ala Gln Glu Leu Leu Gly Gln Ser Lys Arg Lys
275 280 285

cac ttg agt atc agc cct cac aat tac ttg tta tcg ggt ttg gtt aga 912
His Leu Ser Ile Ser Pro His Asn Tyr Leu Leu Ser Gly Leu Val Arg
290 295 300

tgc gga aaa tgc gga aat acc atg aca ggg aag aaa aga aaa tca cat 960
Cys Gly Lys Cys Gly Asn Thr Met Thr Gly Lys Lys Arg Lys Ser His
305 310 315 320

ggt aaa gac tac tat gta tat act tgc cgg aaa aat tat tct ggc gca 1008
Gly Lys Asp Tyr Tyr Val Tyr Thr Cys Arg Lys Asn Tyr Ser Gly Ala
325 330 335

aag gac cgc ggc tgc gga aaa gaa atg tct gag aat aaa ttg aac cgg 1056
Lys Asp Arg Gly Cys Gly Lys Glu Met Ser Glu Asn Lys Leu Asn Arg
340 345 350

cat gta tgg ggt gaa att ttt aaa ttc atc aca aat cct caa aag tat 1104
His Val Trp Gly Glu Ile Phe Lys Phe Ile Thr Asn Pro Gln Lys Tyr
355 360 365

gtt tct ttt aaa gag gct gaa caa tca aat cac ctg tct gat gaa tta 1152
Val Ser Phe Lys Glu Ala Glu Gln Ser Asn His Leu Ser Asp Glu Leu
370 375 380

gaa ctt att gaa aaa gag ata gag aaa aca aaa aaa ggc cgc aag cgt 1200
Glu Leu Ile Glu Lys Glu Ile Glu Lys Thr Lys Lys Gly Arg Lys Arg
385 390 395 400

ctt tta acg cta atc agc cta agc gat gac gat gat tta gac ata gat 1248
Leu Leu Thr Leu Ile Ser Leu Ser Asp Asp Asp Asp Leu Asp Ile Asp
405 410 415

gaa atc aaa gca caa att att gaa ctg caa aaa aag caa aat cag ctt 1296
Glu Ile Lys Ala Gln Ile Ile Glu Leu Gln Lys Lys Gln Asn Gln Leu
420 425 430

act gaa aag tgt aac aga atc cag tca aaa atg aaa gtc cta gat gat 1344
Thr Glu Lys Cys Asn Arg Ile Gln Ser Lys Met Lys Val Leu Asp Asp
435 440 445

acg agc tca agt gaa aat gct cta aaa aga gcc atc gac tat ttt caa 1392
Thr Ser Ser Ser Glu Asn Ala Leu Lys Arg Ala Ile Asp Tyr Phe Gln
450 455 460

tca atc ggt gca gat aac tta act ctt gaa gat aaa aaa aca att gtt 1440
Ser Ile Gly Ala Asp Asn Leu Thr Leu Glu Asp Lys Lys Thr Ile Val
465 470 475 480

aac ttt atc gtg aaa gaa gtt acc att gtg gat tct gac acc ata tat 1488
Asn Phe Ile Val Lys Glu Val Thr Ile Val Asp Ser Asp Thr Ile Tyr
485 490 495

att gaa acg tat taa 1503
Ile Glu Thr Tyr
500



<210> 57
<211> 500
<212> PRT
<213> CisA recombinase

<400> 57

Val Ile Ala Ile Tyr Val Arg Val Ser Thr Glu Glu Gln Ala Ile Lys
1 5 10 15

Gly Ser Ser Ile Asp Ser Gln Ile Glu Ala Cys Ile Lys Lys Ala Gly
20 25 30

Thr Lys Asp Val Leu Lys Tyr Ala Asp Glu Gly Phe Ser Gly Glu Leu
35 40 45

Leu Glu Arg Pro Ala Leu Asn Arg Leu Arg Glu Asp Ala Ser Lys Gly
50 55 60

Leu Ile Ser Gln Val Ile Cys Tyr Asp Pro Asp Arg Leu Ser Arg Lys
65 70 75 80

Leu Met Asn Gln Leu Ile Ile Asp Asp Glu Leu Arg Lys Arg Asn Ile
85 90 95

Pro Leu Ile Phe Val Asn Gly Glu Tyr Ala Asn Ser Pro Glu Gly Gln
100 105 110

Leu Phe Phe Ala Met Arg Gly Ala Ile Ser Glu Phe Glu Lys Ala Lys
115 120 125

Ile Lys Glu Arg Thr Ser Ser Gly Arg Leu Gln Lys Met Lys Lys Gly
130 135 140

Met Ile Ile Lys Asp Ser Lys Leu Tyr Gly Tyr Lys Phe Val Lys Glu
145 150 155 160

Lys Arg Thr Leu Glu Ile Leu Glu Glu Glu Ala Lys Ile Ile Arg Met
165 170 175

Ile Phe Asn Tyr Phe Thr Asp His Lys Ser Pro Phe Phe Gly Arg Val
180 185 190

Asn Gly Ile Ala Leu His Leu Thr Gln Met Gly Val Lys Thr Lys Lys
195 200 205

Gly Ala Lys Val Trp His Arg Gln Val Val Arg Gln Ile Leu Met Asn
210 215 220

Ser Ser Tyr Lys Gly Glu His Arg Gln Tyr Lys Tyr Asp Thr Glu Gly
225 230 235 240

Ser Tyr Val Ser Lys Gln Ala Gly Asn Lys Ser Ile Ile Lys Ile Arg
245 250 255

Pro Glu Glu Glu Gln Ile Thr Val Thr Ile Pro Ala Ile Val Pro Ala
260 265 270

Glu Gln Trp Asp Tyr Ala Gln Glu Leu Leu Gly Gln Ser Lys Arg Lys
275 280 285

His Leu Ser Ile Ser Pro His Asn Tyr Leu Leu Ser Gly Leu Val Arg
290 295 300

Cys Gly Lys Cys Gly Asn Thr Met Thr Gly Lys Lys Arg Lys Ser His
305 310 315 320

Gly Lys Asp Tyr Tyr Val Tyr Thr Cys Arg Lys Asn Tyr Ser Gly Ala
325 330 335

Lys Asp Arg Gly Cys Gly Lys Glu Met Ser Glu Asn Lys Leu Asn Arg
340 345 350

His Val Trp Gly Glu Ile Phe Lys Phe Ile Thr Asn Pro Gln Lys Tyr
355 360 365

Val Ser Phe Lys Glu Ala Glu Gln Ser Asn His Leu Ser Asp Glu Leu
370 375 380

Glu Leu Ile Glu Lys Glu Ile Glu Lys Thr Lys Lys Gly Arg Lys Arg
385 390 395 400

Leu Leu Thr Leu Ile Ser Leu Ser Asp Asp Asp Asp Leu Asp Ile Asp
405 410 415

Glu Ile Lys Ala Gln Ile Ile Glu Leu Gln Lys Lys Gln Asn Gln Leu
420 425 430

Thr Glu Lys Cys Asn Arg Ile Gln Ser Lys Met Lys Val Leu Asp Asp
435 440 445

Thr Ser Ser Ser Glu Asn Ala Leu Lys Arg Ala Ile Asp Tyr Phe Gln
450 455 460

Ser Ile Gly Ala Asp Asn Leu Thr Leu Glu Asp Lys Lys Thr Ile Val
465 470 475 480

Asn Phe Ile Val Lys Glu Val Thr Ile Val Asp Ser Asp Thr Ile Tyr
485 490 495

Ile Glu Thr Tyr
500



<210> 58
<211> 1545
<212> DNA
<213> XisF recombinase

<220>
<221> CDS
<222> (1)..(1542)

<400> 58

atg gaa aat tgg ggt tac gcg aga gtg agc ggt gag gaa cag caa aca 48
Met Glu Asn Trp Gly Tyr Ala Arg Val Ser Gly Glu Glu Gln Gln Thr
1 5 10 15

gat aaa ggt gcg ttg cgt aaa caa ata gaa cgc ttg cgt aat gct gga 96
Asp Lys Gly Ala Leu Arg Lys Gln Ile Glu Arg Leu Arg Asn Ala Gly
20 25 30

tgt tca aaa gtg tac tgg gat att caa tcg cgg aca act gaa gtc aga 144
Cys Ser Lys Val Tyr Trp Asp Ile Gln Ser Arg Thr Thr Glu Val Arg
35 40 45

gaa ggg cta caa caa tta att aat gac tta aag aca tct tca aca ggt 192
Glu Gly Leu Gln Gln Leu Ile Asn Asp Leu Lys Thr Ser Ser Thr Gly
50 55 60

aag gta aaa tca ctg caa ttt acc cgc att gat cgc atc ggc tca tca 240
Lys Val Lys Ser Leu Gln Phe Thr Arg Ile Asp Arg Ile Gly Ser Ser
65 70 75 80

tcg cgg ttg ttt tat tca ttg tta gag gta tta cgt tcc aag gga att 288
Ser Arg Leu Phe Tyr Ser Leu Leu Glu Val Leu Arg Ser Lys Gly Ile
85 90 95

aaa ctg ata gcc tta gat caa ggc gtt gac cca gac agc ctt ggc ggg 336
Lys Leu Ile Ala Leu Asp Gln Gly Val Asp Pro Asp Ser Leu Gly Gly
100 105 110

gaa cta aca att gat atg tta ctg gct gct gcc aaa ttt gag gta aga 384
Glu Leu Thr Ile Asp Met Leu Leu Ala Ala Ala Lys Phe Glu Val Arg
115 120 125

atg gtg acg gag agg tta aaa agc gaa cgt cgt cat agg gtg aac caa 432
Met Val Thr Glu Arg Leu Lys Ser Glu Arg Arg His Arg Val Asn Gln
130 135 140

gga aaa agt cac cga gtt gcc cca tta gga tac cgc aaa gat aaa gat 480
Gly Lys Ser His Arg Val Ala Pro Leu Gly Tyr Arg Lys Asp Lys Asp
145 150 155 160

aaa tat ata cgc gat cgc tca cca tgt gtt tgc tta cta gaa gga cgc 528
Lys Tyr Ile Arg Asp Arg Ser Pro Cys Val Cys Leu Leu Glu Gly Arg
165 170 175

aga gaa tta acg gtg tct gac tta gcc cag tat att ttt cac act ttt 576
Arg Glu Leu Thr Val Ser Asp Leu Ala Gln Tyr Ile Phe His Thr Phe
180 185 190

ttt gag tgc ggt tcc gtt gct gct act gtg cgt aag ctg cac tca gat 624
Phe Glu Cys Gly Ser Val Ala Ala Thr Val Arg Lys Leu His Ser Asp
195 200 205

ttt ggt ata gaa aca aaa gtt ctg aat tgg aac aag cta gaa aaa tct 672
Phe Gly Ile Glu Thr Lys Val Leu Asn Trp Asn Lys Leu Glu Lys Ser
210 215 220

tcc cgg att gtt ggc gac gac gac tta gat aaa att gca ttt aca cca 720
Ser Arg Ile Val Gly Asp Asp Asp Leu Asp Lys Ile Ala Phe Thr Pro
225 230 235 240

aat aaa act aac cac ccc ttg cgt tat ccc tgg tct ggg cta aga tgg 768
Asn Lys Thr Asn His Pro Leu Arg Tyr Pro Trp Ser Gly Leu Arg Trp
245 250 255

tca atc cct ggt tta aaa gcg tta tta gtt aac cct gtt tac gcc ggg 816
Ser Ile Pro Gly Leu Lys Ala Leu Leu Val Asn Pro Val Tyr Ala Gly
260 265 270

ggt ttg ccc ttt gat act tac gtt aaa tca aaa gga aaa cgc aag cat 864
Gly Leu Pro Phe Asp Thr Tyr Val Lys Ser Lys Gly Lys Arg Lys His
275 280 285

ttt gac gag tgg aaa gta aaa tgg gga acc cac gac gat gag gca atc 912
Phe Asp Glu Trp Lys Val Lys Trp Gly Thr His Asp Asp Glu Ala Ile
290 295 300

att acc tgt gag gaa cat gaa aga ata aaa cag atg att cga gac aat 960
Ile Thr Cys Glu Glu His Glu Arg Ile Lys Gln Met Ile Arg Asp Asn
305 310 315 320

cgc aat aat cga tgg gct gca aga gaa gaa aac gaa gta aac cca ttt 1008
Arg Asn Asn Arg Trp Ala Ala Arg Glu Glu Asn Glu Val Asn Pro Phe
325 330 335

tct aat tta ctt aaa tgt acc cat tgc ggc ggc tca atg aca cgc cac 1056
Ser Asn Leu Leu Lys Cys Thr His Cys Gly Gly Ser Met Thr Arg His
340 345 350

gcc aaa cgt gta gat aag agt gga caa gct atc tat tat tat cag tgc 1104
Ala Lys Arg Val Asp Lys Ser Gly Gln Ala Ile Tyr Tyr Tyr Gln Cys
355 360 365

cga ttg tat aaa gct ggc aac tgt agc aat aaa aat atg att tca tcc 1152
Arg Leu Tyr Lys Ala Gly Asn Cys Ser Asn Lys Asn Met Ile Ser Ser
370 375 380

aaa ata tta gat atc caa gta atg gat tta ttg gca caa gaa gcc gaa 1200
Lys Ile Leu Asp Ile Gln Val Met Asp Leu Leu Ala Gln Glu Ala Glu
385 390 395 400

cgt tta gca aat ttg gtg gaa aca gat gag ccg ctt att gta gaa gaa 1248
Arg Leu Ala Asn Leu Val Glu Thr Asp Glu Pro Leu Ile Val Glu Glu
405 410 415

ccc cca gaa gta aaa acg ctg cgc gca tcc ctg aat agt ctg gaa aca 1296
Pro Pro Glu Val Lys Thr Leu Arg Ala Ser Leu Asn Ser Leu Glu Thr
420 425 430

ttg cca gca agt tca gca att gaa caa att aaa aat gac ctc aaa gaa 1344
Leu Pro Ala Ser Ser Ala Ile Glu Gln Ile Lys Asn Asp Leu Lys Glu
435 440 445

cag att gcg atc gca cta gga gca acc aat aat gct tct aaa caa tct 1392
Gln Ile Ala Ile Ala Leu Gly Ala Thr Asn Asn Ala Ser Lys Gln Ser
450 455 460

ctg att gcc aag gaa aga att ata caa gct ttt gct cat aaa agt tac 1440
Leu Ile Ala Lys Glu Arg Ile Ile Gln Ala Phe Ala His Lys Ser Tyr
465 470 475 480

tgg caa gga cta aac gct caa gat aaa cga gca atc ctc aat ggt tgc 1488
Trp Gln Gly Leu Asn Ala Gln Asp Lys Arg Ala Ile Leu Asn Gly Cys
485 490 495

gta aaa aaa atc tcc gta gat ggt aac ttt gtt aca gct att gag tat 1536
Val Lys Lys Ile Ser Val Asp Gly Asn Phe Val Thr Ala Ile Glu Tyr
500 505 510

cgt tac tag 1545
Arg Tyr

<210> 59
<211> 514
<212> PRT
<213> XisF recombinase

<400> 59

Met Glu Asn Trp Gly Tyr Ala Arg Val Ser Gly Glu Glu Gln Gln Thr
1 5 10 15

Asp Lys Gly Ala Leu Arg Lys Gln Ile Glu Arg Leu Arg Asn Ala Gly
20 25 30

Cys Ser Lys Val Tyr Trp Asp Ile Gln Ser Arg Thr Thr Glu Val Arg
35 40 45

Glu Gly Leu Gln Gln Leu Ile Asn Asp Leu Lys Thr Ser Ser Thr Gly
50 55 60

Lys Val Lys Ser Leu Gln Phe Thr Arg Ile Asp Arg Ile Gly Ser Ser
65 70 75 80

Ser Arg Leu Phe Tyr Ser Leu Leu Glu Val Leu Arg Ser Lys Gly Ile
85 90 95

Lys Leu Ile Ala Leu Asp Gln Gly Val Asp Pro Asp Ser Leu Gly Gly
100 105 110

Glu Leu Thr Ile Asp Met Leu Leu Ala Ala Ala Lys Phe Glu Val Arg
115 120 125

Met Val Thr Glu Arg Leu Lys Ser Glu Arg Arg His Arg Val Asn Gln
130 135 140

Gly Lys Ser His Arg Val Ala Pro Leu Gly Tyr Arg Lys Asp Lys Asp
145 150 155 160

Lys Tyr Ile Arg Asp Arg Ser Pro Cys Val Cys Leu Leu Glu Gly Arg
165 170 175

Arg Glu Leu Thr Val Ser Asp Leu Ala Gln Tyr Ile Phe His Thr Phe
180 185 190

Phe Glu Cys Gly Ser Val Ala Ala Thr Val Arg Lys Leu His Ser Asp
195 200 205

Phe Gly Ile Glu Thr Lys Val Leu Asn Trp Asn Lys Leu Glu Lys Ser
210 215 220

Ser Arg Ile Val Gly Asp Asp Asp Leu Asp Lys Ile Ala Phe Thr Pro
225 230 235 240

Asn Lys Thr Asn His Pro Leu Arg Tyr Pro Trp Ser Gly Leu Arg Trp
245 250 255

Ser Ile Pro Gly Leu Lys Ala Leu Leu Val Asn Pro Val Tyr Ala Gly
260 265 270

Gly Leu Pro Phe Asp Thr Tyr Val Lys Ser Lys Gly Lys Arg Lys His
275 280 285

Phe Asp Glu Trp Lys Val Lys Trp Gly Thr His Asp Asp Glu Ala Ile
290 295 300

Ile Thr Cys Glu Glu His Glu Arg Ile Lys Gln Met Ile Arg Asp Asn
305 310 315 320

Arg Asn Asn Arg Trp Ala Ala Arg Glu Glu Asn Glu Val Asn Pro Phe
325 330 335

Ser Asn Leu Leu Lys Cys Thr His Cys Gly Gly Ser Met Thr Arg His
340 345 350

Ala Lys Arg Val Asp Lys Ser Gly Gln Ala Ile Tyr Tyr Tyr Gln Cys
355 360 365

Arg Leu Tyr Lys Ala Gly Asn Cys Ser Asn Lys Asn Met Ile Ser Ser
370 375 380

Lys Ile Leu Asp Ile Gln Val Met Asp Leu Leu Ala Gln Glu Ala Glu
385 390 395 400

Arg Leu Ala Asn Leu Val Glu Thr Asp Glu Pro Leu Ile Val Glu Glu
405 410 415

Pro Pro Glu Val Lys Thr Leu Arg Ala Ser Leu Asn Ser Leu Glu Thr
420 425 430

Leu Pro Ala Ser Ser Ala Ile Glu Gln Ile Lys Asn Asp Leu Lys Glu
435 440 445

Gln Ile Ala Ile Ala Leu Gly Ala Thr Asn Asn Ala Ser Lys Gln Ser
450 455 460

Leu Ile Ala Lys Glu Arg Ile Ile Gln Ala Phe Ala His Lys Ser Tyr
465 470 475 480

Trp Gln Gly Leu Asn Ala Gln Asp Lys Arg Ala Ile Leu Asn Gly Cys
485 490 495

Val Lys Lys Ile Ser Val Asp Gly Asn Phe Val Thr Ala Ile Glu Tyr
500 505 510

Arg Tyr



<210> 60
<211> 2124
<212> DNA
<213> Transposon Tn4451

<220>
<221> CDS
<222> (1)..(2121)

<400> 60

atg tca agg act tca aga att aca gca ctt tac gag cgt ttg tca aga 48
Met Ser Arg Thr Ser Arg Ile Thr Ala Leu Tyr Glu Arg Leu Ser Arg
1 5 10 15

gat gat gac ctt act ggc gag agt aat tct att acc aat caa aag aaa 96
Asp Asp Asp Leu Thr Gly Glu Ser Asn Ser Ile Thr Asn Gln Lys Lys
20 25 30

tac ctc gaa gat tat gcc cgt agg aat ggt ttt gag aac att cgc cat 144
Tyr Leu Glu Asp Tyr Ala Arg Arg Asn Gly Phe Glu Asn Ile Arg His
35 40 45

ttt acc gat gac gga ttt tcg ggt gta aat ttc aat cgc cct ggc ttt 192
Phe Thr Asp Asp Gly Phe Ser Gly Val Asn Phe Asn Arg Pro Gly Phe
50 55 60

caa tct ctg ata aaa gaa gtt gaa gca gga aat gta gaa acc ttg att 240
Gln Ser Leu Ile Lys Glu Val Glu Ala Gly Asn Val Glu Thr Leu Ile
65 70 75 80

gtt aag gat atg agc cga ttg ggg cga aat tat ctg caa gta ggt ttt 288
Val Lys Asp Met Ser Arg Leu Gly Arg Asn Tyr Leu Gln Val Gly Phe
85 90 95

tat acg gaa gtt ctg ttt cca cag aaa aat gtc cgt ttc ctt gca att 336
Tyr Thr Glu Val Leu Phe Pro Gln Lys Asn Val Arg Phe Leu Ala Ile
100 105 110

aac aac agt att gac agt aac aac gct tcg gat aat gac ttt gct ccg 384
Asn Asn Ser Ile Asp Ser Asn Asn Ala Ser Asp Asn Asp Phe Ala Pro
115 120 125

ttt ttg aat att atg aac gaa tgg tat gcc aaa gac aca agc aac aaa 432
Phe Leu Asn Ile Met Asn Glu Trp Tyr Ala Lys Asp Thr Ser Asn Lys
130 135 140

atc aag gct ata ttc gat gcc cgt atg aaa gac gga aag cgt tgt agc 480
Ile Lys Ala Ile Phe Asp Ala Arg Met Lys Asp Gly Lys Arg Cys Ser
145 150 155 160

ggt tct atc cct tat ggg tat aac cga ctg ccg agc gac aaa caa acg 528
Gly Ser Ile Pro Tyr Gly Tyr Asn Arg Leu Pro Ser Asp Lys Gln Thr
165 170 175

ctt gtg gtt gac cct gtg gct tcg gaa gtg gta aag cgt atc ttt act 576
Leu Val Val Asp Pro Val Ala Ser Glu Val Val Lys Arg Ile Phe Thr
180 185 190

ctt gcc aat gat ggc aaa agt aca agg gca atc gca gaa ata ctg acc 624
Leu Ala Asn Asp Gly Lys Ser Thr Arg Ala Ile Ala Glu Ile Leu Thr
195 200 205

gaa gaa aaa gtt tta acc cct gcg gca tac gca aag gaa tac cac ccc 672
Glu Glu Lys Val Leu Thr Pro Ala Ala Tyr Ala Lys Glu Tyr His Pro
210 215 220

gaa cag tac aac ggc aac aag ttc aca aac cct tat ctt tgg gca atg 720
Glu Gln Tyr Asn Gly Asn Lys Phe Thr Asn Pro Tyr Leu Trp Ala Met
225 230 235 240

tca acg ata aga aat att tta ggc agg cag gaa tat ctc ggt cac acc 768
Ser Thr Ile Arg Asn Ile Leu Gly Arg Gln Glu Tyr Leu Gly His Thr
245 250 255

gtt ttg cga aag tcg gta agc aca aat ttc aaa ctt cac aag aga aaa 816
Val Leu Arg Lys Ser Val Ser Thr Asn Phe Lys Leu His Lys Arg Lys
260 265 270

agc aca gac gaa gaa gaa cag tat gta ttt ccg aat aca cac gag cct 864
Ser Thr Asp Glu Glu Glu Gln Tyr Val Phe Pro Asn Thr His Glu Pro
275 280 285

atc ata tcg cag gaa ctt tgg gac agc gtt caa aaa cgc aga agc aga 912
Ile Ile Ser Gln Glu Leu Trp Asp Ser Val Gln Lys Arg Arg Ser Arg
290 295 300

gta aat cgt gcc tcg gct tgg gga acg cac agc aac cgt tta agc gga 960
Val Asn Arg Ala Ser Ala Trp Gly Thr His Ser Asn Arg Leu Ser Gly
305 310 315 320

tat ttg tac tgt gcc gat tgc gga aga aga atg act ttg cag aca cat 1008
Tyr Leu Tyr Cys Ala Asp Cys Gly Arg Arg Met Thr Leu Gln Thr His
325 330 335

tac agc aaa aaa gac ggt tct gtg cag tat tct tac cgt tgc ggt ggg 1056
Tyr Ser Lys Lys Asp Gly Ser Val Gln Tyr Ser Tyr Arg Cys Gly Gly
340 345 350

tat gca agc aga gtg aac agt tgt acc agt cat tcg att agt acc gat 1104
Tyr Ala Ser Arg Val Asn Ser Cys Thr Ser His Ser Ile Ser Thr Asp
355 360 365

aat gtt gaa gcc ttg ata tta tca tct gtc aaa cgc ttt tca agg ttt 1152
Asn Val Glu Ala Leu Ile Leu Ser Ser Val Lys Arg Phe Ser Arg Phe
370 375 380

gtt ctg aat gat gaa caa gca ttt gct ttg gaa ctg caa tct ctt tgg 1200
Val Leu Asn Asp Glu Gln Ala Phe Ala Leu Glu Leu Gln Ser Leu Trp
385 390 395 400

aat gaa aaa cag gag gaa aag ccg aaa cac aat caa tcg gaa ctg caa 1248
Asn Glu Lys Gln Glu Glu Lys Pro Lys His Asn Gln Ser Glu Leu Gln
405 410 415

cgc tgt cag aaa cgc tat gac gaa ctc tct acc ctt gtt cgt ggc ttg 1296
Arg Cys Gln Lys Arg Tyr Asp Glu Leu Ser Thr Leu Val Arg Gly Leu
420 425 430

tat gaa aat ctt atg tcg gga tta ctg ccc gaa aga cag tat aag caa 1344
Tyr Glu Asn Leu Met Ser Gly Leu Leu Pro Glu Arg Gln Tyr Lys Gln
435 440 445

ctg atg aaa cag tat gat gac gag cag gca gag ttg gaa acg aaa atg 1392
Leu Met Lys Gln Tyr Asp Asp Glu Gln Ala Glu Leu Glu Thr Lys Met
450 455 460

gaa acg atg aaa aca gaa ctt gcc gaa gaa aaa gta agt tcc gtt gat 1440
Glu Thr Met Lys Thr Glu Leu Ala Glu Glu Lys Val Ser Ser Val Asp
465 470 475 480

att aag cat ttc att tcg ctg ata cgc aag tgt aaa aat cct acg gaa 1488
Ile Lys His Phe Ile Ser Leu Ile Arg Lys Cys Lys Asn Pro Thr Glu
485 490 495

atc tcc gat aca atg ttt aat gaa ctt gtt gat aag ata gtg gtt tat 1536
Ile Ser Asp Thr Met Phe Asn Glu Leu Val Asp Lys Ile Val Val Tyr
500 505 510

gaa gca gag ggt gtg gga aaa gca cga aca caa aag gtc gat att tat 1584
Glu Ala Glu Gly Val Gly Lys Ala Arg Thr Gln Lys Val Asp Ile Tyr
515 520 525

ttt aac tat gtc ggt caa gtg gat att gcc tat acc gaa gaa gaa ctt 1632
Phe Asn Tyr Val Gly Gln Val Asp Ile Ala Tyr Thr Glu Glu Glu Leu
530 535 540

gcc gag ata gaa aca cag aaa gag cag gag gaa cag caa cgc ttg gca 1680
Ala Glu Ile Glu Thr Gln Lys Glu Gln Glu Glu Gln Gln Arg Leu Ala
545 550 555 560

aga cag cgc aag cgt gaa aaa gcc tac cga gaa aag cga aag gca cag 1728
Arg Gln Arg Lys Arg Glu Lys Ala Tyr Arg Glu Lys Arg Lys Ala Gln
565 570 575

aaa atc gct gaa aac ggt ggc gaa atc gtt aag aca aag gtt tgc cct 1776
Lys Ile Ala Glu Asn Gly Gly Glu Ile Val Lys Thr Lys Val Cys Pro
580 585 590

cat tgc aac aaa gag ttt atc ccg aca agc aac cga cag gtg ttc tgt 1824
His Cys Asn Lys Glu Phe Ile Pro Thr Ser Asn Arg Gln Val Phe Cys
595 600 605

tcc aaa gag tgc tgc tat caa gca agg caa gac aaa aag aaa aca gac 1872
Ser Lys Glu Cys Cys Tyr Gln Ala Arg Gln Asp Lys Lys Lys Thr Asp
610 615 620

cga gaa gca gaa cga gga aat cac tat tac cga cag cgt gta tgt gct 1920
Arg Glu Ala Glu Arg Gly Asn His Tyr Tyr Arg Gln Arg Val Cys Ala
625 630 635 640

gtg tgc ggc aat tcc tat tgg cct aca cac agc caa cag aaa ttc tgc 1968
Val Cys Gly Asn Ser Tyr Trp Pro Thr His Ser Gln Gln Lys Phe Cys
645 650 655



tcc gaa gaa tgt caa agg gta aat cac aat aag aaa aca ttg gaa ttt 2016
Ser Glu Glu Cys Gln Arg Val Asn His Asn Lys Lys Thr Leu Glu Phe
660 665 670

tac cac cat aaa aaa gaa aag gag aag ctg caa tgc aaa gat tta tca 2064
Tyr His His Lys Lys Glu Lys Glu Lys Leu Gln Cys Lys Asp Leu Ser
675 680 685

cag acg aaa gaa cgg gta tcc gat atg aac tta tcg ggg act att act 2112
Gln Thr Lys Glu Arg Val Ser Asp Met Asn Leu Ser Gly Thr Ile Thr
690 695 700

acc cct gct taa 2124
Thr Pro Ala
705



<210> 61
<211> 707
<212> PRT
<213> Transposon Tn4451

<400> 61

Met Ser Arg Thr Ser Arg Ile Thr Ala Leu Tyr Glu Arg Leu Ser Arg
1 5 10 15

Asp Asp Asp Leu Thr Gly Glu Ser Asn Ser Ile Thr Asn Gln Lys Lys
20 25 30

Tyr Leu Glu Asp Tyr Ala Arg Arg Asn Gly Phe Glu Asn Ile Arg His
35 40 45

Phe Thr Asp Asp Gly Phe Ser Gly Val Asn Phe Asn Arg Pro Gly Phe
50 55 60

Gln Ser Leu Ile Lys Glu Val Glu Ala Gly Asn Val Glu Thr Leu Ile
65 70 75 80

Val Lys Asp Met Ser Arg Leu Gly Arg Asn Tyr Leu Gln Val Gly Phe
85 90 95

Tyr Thr Glu Val Leu Phe Pro Gln Lys Asn Val Arg Phe Leu Ala Ile
100 105 110

Asn Asn Ser Ile Asp Ser Asn Asn Ala Ser Asp Asn Asp Phe Ala Pro
115 120 125

Phe Leu Asn Ile Met Asn Glu Trp Tyr Ala Lys Asp Thr Ser Asn Lys
130 135 140

Ile Lys Ala Ile Phe Asp Ala Arg Met Lys Asp Gly Lys Arg Cys Ser
145 150 155 160

Gly Ser Ile Pro Tyr Gly Tyr Asn Arg Leu Pro Ser Asp Lys Gln Thr
165 170 175

Leu Val Val Asp Pro Val Ala Ser Glu Val Val Lys Arg Ile Phe Thr
180 185 190

Leu Ala Asn Asp Gly Lys Ser Thr Arg Ala Ile Ala Glu Ile Leu Thr
195 200 205

Glu Glu Lys Val Leu Thr Pro Ala Ala Tyr Ala Lys Glu Tyr His Pro
210 215 220

Glu Gln Tyr Asn Gly Asn Lys Phe Thr Asn Pro Tyr Leu Trp Ala Met
225 230 235 240

Ser Thr Ile Arg Asn Ile Leu Gly Arg Gln Glu Tyr Leu Gly His Thr
245 250 255

Val Leu Arg Lys Ser Val Ser Thr Asn Phe Lys Leu His Lys Arg Lys
260 265 270

Ser Thr Asp Glu Glu Glu Gln Tyr Val Phe Pro Asn Thr His Glu Pro
275 280 285

Ile Ile Ser Gln Glu Leu Trp Asp Ser Val Gln Lys Arg Arg Ser Arg
290 295 300

Val Asn Arg Ala Ser Ala Trp Gly Thr His Ser Asn Arg Leu Ser Gly
305 310 315 320

Tyr Leu Tyr Cys Ala Asp Cys Gly Arg Arg Met Thr Leu Gln Thr His
325 330 335

Tyr Ser Lys Lys Asp Gly Ser Val Gln Tyr Ser Tyr Arg Cys Gly Gly
340 345 350

Tyr Ala Ser Arg Val Asn Ser Cys Thr Ser His Ser Ile Ser Thr Asp
355 360 365

Asn Val Glu Ala Leu Ile Leu Ser Ser Val Lys Arg Phe Ser Arg Phe
370 375 380

Val Leu Asn Asp Glu Gln Ala Phe Ala Leu Glu Leu Gln Ser Leu Trp
385 390 395 400

Asn Glu Lys Gln Glu Glu Lys Pro Lys His Asn Gln Ser Glu Leu Gln
405 410 415

Arg Cys Gln Lys Arg Tyr Asp Glu Leu Ser Thr Leu Val Arg Gly Leu
420 425 430

Tyr Glu Asn Leu Met Ser Gly Leu Leu Pro Glu Arg Gln Tyr Lys Gln
435 440 445

Leu Met Lys Gln Tyr Asp Asp Glu Gln Ala Glu Leu Glu Thr Lys Met
450 455 460

Glu Thr Met Lys Thr Glu Leu Ala Glu Glu Lys Val Ser Ser Val Asp
465 470 475 480

Ile Lys His Phe Ile Ser Leu Ile Arg Lys Cys Lys Asn Pro Thr Glu
485 490 495

Ile Ser Asp Thr Met Phe Asn Glu Leu Val Asp Lys Ile Val Val Tyr
500 505 510

Glu Ala Glu Gly Val Gly Lys Ala Arg Thr Gln Lys Val Asp Ile Tyr
515 520 525

Phe Asn Tyr Val Gly Gln Val Asp Ile Ala Tyr Thr Glu Glu Glu Leu
530 535 540

Ala Glu Ile Glu Thr Gln Lys Glu Gln Glu Glu Gln Gln Arg Leu Ala
545 550 555 560

Arg Gln Arg Lys Arg Glu Lys Ala Tyr Arg Glu Lys Arg Lys Ala Gln
565 570 575

Lys Ile Ala Glu Asn Gly Gly Glu Ile Val Lys Thr Lys Val Cys Pro
580 585 590

His Cys Asn Lys Glu Phe Ile Pro Thr Ser Asn Arg Gln Val Phe Cys
595 600 605

Ser Lys Glu Cys Cys Tyr Gln Ala Arg Gln Asp Lys Lys Lys Thr Asp
610 615 620

Arg Glu Ala Glu Arg Gly Asn His Tyr Tyr Arg Gln Arg Val Cys Ala
625 630 635 640

Val Cys Gly Asn Ser Tyr Trp Pro Thr His Ser Gln Gln Lys Phe Cys
645 650 655

Ser Glu Glu Cys Gln Arg Val Asn His Asn Lys Lys Thr Leu Glu Phe
660 665 670

Tyr His His Lys Lys Glu Lys Glu Lys Leu Gln Cys Lys Asp Leu Ser
675 680 685

Gln Thr Lys Glu Arg Val Ser Asp Met Asn Leu Ser Gly Thr Ile Thr
690 695 700

Thr Pro Ala
705



<210> 62
<211> 1420
<212> DNA
<213> XisA recombinase

<220>
<221> CDS
<222> (1)..(1416)

<400> 62

atg caa aat cag ggt caa gac aaa tat caa caa gcc ttt gca gac tta 48
Met Gln Asn Gln Gly Gln Asp Lys Tyr Gln Gln Ala Phe Ala Asp Leu
1 5 10 15

gag cca ctt tca tct acc gac ggc agt ttt ctc ggc tca agt ctg caa 96
Glu Pro Leu Ser Ser Thr Asp Gly Ser Phe Leu Gly Ser Ser Leu Gln
20 25 30

gca cag cag caa aga gaa cac atg aga aca aaa gta cta caa gac cta 144
Ala Gln Gln Gln Arg Glu His Met Arg Thr Lys Val Leu Gln Asp Leu
35 40 45

gac aag gta aat ctg cgt ttg aag tct gca aag acg aaa gtc tca gtt 192
Asp Lys Val Asn Leu Arg Leu Lys Ser Ala Lys Thr Lys Val Ser Val
50 55 60

cga gaa tct aac gga agt ctg caa tta cga gca acg tta cca att aaa 240
Arg Glu Ser Asn Gly Ser Leu Gln Leu Arg Ala Thr Leu Pro Ile Lys
65 70 75 80

cct gga gat aag gac acc aac ggt aca ggc aga aag caa tac aat ctc 288
Pro Gly Asp Lys Asp Thr Asn Gly Thr Gly Arg Lys Gln Tyr Asn Leu
85 90 95

agc ttg aat atc cct gca aac ttg gat gga ctg aag acg gct gag gaa 336
Ser Leu Asn Ile Pro Ala Asn Leu Asp Gly Leu Lys Thr Ala Glu Glu
100 105 110

gaa gct tat gaa tta ggt aaa tta atc gct cgg aaa acc ttt gaa tgg 384
Glu Ala Tyr Glu Leu Gly Lys Leu Ile Ala Arg Lys Thr Phe Glu Trp
115 120 125

aat gat aaa tat tta ggc aaa gaa gcc act aaa aaa gat tca caa aca 432
Asn Asp Lys Tyr Leu Gly Lys Glu Ala Thr Lys Lys Asp Ser Gln Thr
130 135 140

ata ggt gat tta cta gaa aaa ttt gca gaa gag tat ttt aaa acc cat 480
Ile Gly Asp Leu Leu Glu Lys Phe Ala Glu Glu Tyr Phe Lys Thr His
145 150 155 160

aaa cgc acc act aaa agc gaa cat acc ttt ttt tac tat ttt tcc cgc 528
Lys Arg Thr Thr Lys Ser Glu His Thr Phe Phe Tyr Tyr Phe Ser Arg
165 170 175

acc caa cga tat acc aat tcc aaa gat tta gca acg gcg gaa aat ctc 576
Thr Gln Arg Tyr Thr Asn Ser Lys Asp Leu Ala Thr Ala Glu Asn Leu
180 185 190

atc aat tca att gag caa atc gat aaa gaa tgg gcg aga tat aat gcc 624
Ile Asn Ser Ile Glu Gln Ile Asp Lys Glu Trp Ala Arg Tyr Asn Ala
195 200 205

gcc aga gcc ata tca gct ttt tgc ata aca ttc aat ata gaa att gat 672
Ala Arg Ala Ile Ser Ala Phe Cys Ile Thr Phe Asn Ile Glu Ile Asp
210 215 220

ttg tcc cag tat tcc aaa atg cct gat cgc aat tcg cgc aac atc ccc 720
Leu Ser Gln Tyr Ser Lys Met Pro Asp Arg Asn Ser Arg Asn Ile Pro
225 230 235 240

aca gat gca gaa ata cta tca gga att acc aaa ttt gaa gac tat cta 768
Thr Asp Ala Glu Ile Leu Ser Gly Ile Thr Lys Phe Glu Asp Tyr Leu
245 250 255

gtt acc aga gga aat caa gtt aat gaa gat gta aaa gat agc tgg caa 816
Val Thr Arg Gly Asn Gln Val Asn Glu Asp Val Lys Asp Ser Trp Gln
260 265 270

ctt tgg cgc tgg aca tat gga atg tta gca gtt ttt ggt tta cgc ccc 864
Leu Trp Arg Trp Thr Tyr Gly Met Leu Ala Val Phe Gly Leu Arg Pro
275 280 285

agg gaa att ttt att aac cct aat att gat tgg tgg tta agc aaa gag 912
Arg Glu Ile Phe Ile Asn Pro Asn Ile Asp Trp Trp Leu Ser Lys Glu
290 295 300

aat ata gac ctc aca tgg aaa gta gac aaa gaa tgt aaa act ggt gaa 960
Asn Ile Asp Leu Thr Trp Lys Val Asp Lys Glu Cys Lys Thr Gly Glu
305 310 315 320

aga caa gca tta ccc tta cat aaa gaa tgg att gat gag ttt gat tta 1008
Arg Gln Ala Leu Pro Leu His Lys Glu Trp Ile Asp Glu Phe Asp Leu
325 330 335

aga aat ccg aaa tat tta gaa atg ctg gca aca gca att agt aaa aaa 1056
Arg Asn Pro Lys Tyr Leu Glu Met Leu Ala Thr Ala Ile Ser Lys Lys
340 345 350

gat aaa aca aat cat gct gaa ata aca gcc tta act cag cgt att agt 1104
Asp Lys Thr Asn His Ala Glu Ile Thr Ala Leu Thr Gln Arg Ile Ser
355 360 365

tgg tgg ttt cgg aaa gtc gaa tta gat ttt aaa ccc tat gat tta cgt 1152
Trp Trp Phe Arg Lys Val Glu Leu Asp Phe Lys Pro Tyr Asp Leu Arg
370 375 380

cac gcc tgg gca atc aga gcg cat att tta ggc ata cca atc aaa gcg 1200
His Ala Trp Ala Ile Arg Ala His Ile Leu Gly Ile Pro Ile Lys Ala
385 390 395 400

gcg gct gat aat ttg ggg cat agt atg cag gtt cat aca caa acc tat 1248
Ala Ala Asp Asn Leu Gly His Ser Met Gln Val His Thr Gln Thr Tyr
405 410 415

cag cgc tgg ttc tcg cta gat atg cgg aag tta gcg att aat cag gct 1296
Gln Arg Trp Phe Ser Leu Asp Met Arg Lys Leu Ala Ile Asn Gln Ala
420 425 430

ttg act aag agg aat gaa ttt gag gtg att agg gag gag aat gct aaa 1344
Leu Thr Lys Arg Asn Glu Phe Glu Val Ile Arg Glu Glu Asn Ala Lys
435 440 445

ttg cag ata gaa aat gaa agg ttg agg atg gaa att gag aag tta aag 1392
Leu Gln Ile Glu Asn Glu Arg Leu Arg Met Glu Ile Glu Lys Leu Lys
450 455 460

atg gaa ata gct tat aag aat agt tgag 1420
Met Glu Ile Ala Tyr Lys Asn Ser
465 470

30 

<210> 63 
<211> 472 
<212> PRT 
<213> XisA recombinase

<400> 63

Met Gln Asn Gln Gly Gln Asp Lys Tyr Gln Gln Ala Phe Ala Asp Leu
1 5 10 15

Glu Pro Leu Ser Ser Thr Asp Gly Ser Phe Leu Gly Ser Ser Leu Gln
20 25 30

Ala Gln Gln Gln Arg Glu His Met Arg Thr Lys Val Leu Gln Asp Leu
35 40 45

Asp Lys Val Asn Leu Arg Leu Lys Ser Ala Lys Thr Lys Val Ser Val
50 55 60

Arg Glu Ser Asn Gly Ser Leu Gln Leu Arg Ala Thr Leu Pro Ile Lys
65 70 75 80

Pro Gly Asp Lys Asp Thr Asn Gly Thr Gly Arg Lys Gln Tyr Asn Leu
85 90 95

Ser Leu Asn Ile Pro Ala Asn Leu Asp Gly Leu Lys Thr Ala Glu Glu
100 105 110

Glu Ala Tyr Glu Leu Gly Lys Leu Ile Ala Arg Lys Thr Phe Glu Trp
115 120 125

Asn Asp Lys Tyr Leu Gly Lys Glu Ala Thr Lys Lys Asp Ser Gln Thr
130 135 140

Ile Gly Asp Leu Leu Glu Lys Phe Ala Glu Glu Tyr Phe Lys Thr His
145 150 155 160

Lys Arg Thr Thr Lys Ser Glu His Thr Phe Phe Tyr Tyr Phe Ser Arg
165 170 175 

Thr Gln Arg Tyr Thr Asn Ser Lys Asp Leu Ala Thr Ala Glu Asn Leu
180 185 190

Ile Asn Ser Ile Glu Gln Ile Asp Lys Glu Trp Ala Arg Tyr Asn Ala
195 200 205

Ala Arg Ala Ile Ser Ala Phe Cys Ile Thr Phe Asn Ile Glu Ile Asp
210 215 220

Leu Ser Gln Tyr Ser Lys Met Pro Asp Arg Asn Ser Arg Asn Ile Pro
225 230 235 240

Thr Asp Ala Glu Ile Leu Ser Gly Ile Thr Lys Phe Glu Asp Tyr Leu
245 250 255

Val Thr Arg Gly Asn Gln Val Asn Glu Asp Val Lys Asp Ser Trp Gln
260 265 270

Leu Trp Arg Trp Thr Tyr Gly Met Leu Ala Val Phe Gly Leu Arg Pro
275 280 285

Arg Glu Ile Phe Ile Asn Pro Asn Ile Asp Trp Trp Leu Ser Lys Glu
290 295 300

Asn Ile Asp Leu Thr Trp Lys Val Asp Lys Glu Cys Lys Thr Gly Glu
305 310 315 320

Arg Gln Ala Leu Pro Leu His Lys Glu Trp Ile Asp Glu Phe Asp Leu
325 330 335

Arg Asn Pro Lys Tyr Leu Glu Met Leu Ala Thr Ala Ile Ser Lys Lys
340 345 350

Asp Lys Thr Asn His Ala Glu Ile Thr Ala Leu Thr Gln Arg Ile Ser
355 360 365

Trp Trp Phe Arg Lys Val Glu Leu Asp Phe Lys Pro Tyr Asp Leu Arg
370 375 380

His Ala Trp Ala Ile Arg Ala His Ile Leu Gly Ile Pro Ile Lys Ala
385 390 395 400

Ala Ala Asp Asn Leu Gly His Ser Met Gln Val His Thr Gln Thr Tyr
405 410 415

Gln Arg Trp Phe Ser Leu Asp Met Arg Lys Leu Ala Ile Asn Gln Ala
420 425 430

Leu Thr Lys Arg Asn Glu Phe Glu Val Ile Arg Glu Glu Asn Ala Lys
435 440 445

Leu Gln Ile Glu Asn Glu Arg Leu Arg Met Glu Ile Glu Lys Leu Lys
450 455 460

Met Glu Ile Ala Tyr Lys Asn Ser
465 470 



<210> 64 
<211> 1008 
<212> DNA

<213> Artificial Sequence
<220>

<221> CDS

<222> (1)..(1005)

<220>

<223> Description of Artificial Sequence: vector
pBS-SSV3

<400> 64 

atg acg aaa gat aag acg cgt tat aaa tac ggg gat tat att tta cgc 48
Met Thr Lys Asp Lys Thr Arg Tyr Lys Tyr Gly Asp Tyr Ile Leu Arg
1 5 10 15

gag agg aaa ggg cgg tat tat gtt tac aag cta gag tat gaa aac ggt 96
Glu Arg Lys Gly Arg Tyr Tyr Val Tyr Lys Leu Glu Tyr Glu Asn Gly
20 25 30

gag gta aaa gag cgt tac gtg ggt cct tta gct gac gtc gtt gaa tca 144
Glu Val Lys Glu Arg Tyr Val Gly Pro Leu Ala Asp Val Val Glu Ser
35 40 45

tat cta aaa atg aaa tta ggg gtc gta ggg gat act ccc cta caa gcg 192
Tyr Leu Lys Met Lys Leu Gly Val Val Gly Asp Thr Pro Leu Gln Ala
50 55 60

gat ccc ccc ggt ttc gag ccc ggg aca agc gga agc ggt ggt gga aaa 240
Asp Pro Pro Gly Phe Glu Pro Gly Thr Ser Gly Ser Gly Gly Gly Lys
65 70 75 80

gag gga act gaa cga cgt aaa ata gcg ttg gtt gcc aat ttg cgc caa 288
Glu Gly Thr Glu Arg Arg Lys Ile Ala Leu Val Ala Asn Leu Arg Gln
85 90 95

tac gcg acg gac ggc aac ata aag gcg ttc tac aac tat ctc atg aac 336
Tyr Ala Thr Asp Gly Asn Ile Lys Ala Phe Tyr Asn Tyr Leu Met Asn
100 105 110

gaa agg ggg ata agc gaa aaa act gca aag gac tac atc aat gct ata 384
Glu Arg Gly Ile Ser Glu Lys Thr Ala Lys Asp Tyr Ile Asn Ala Ile
115 120 125

tca aag ccg tat aaa gag acg aga gac gca cag aag gct tac cga ctc 432
Ser Lys Pro Tyr Lys Glu Thr Arg Asp Ala Gln Lys Ala Tyr Arg Leu
130 135 140

ttt gca cgt ttc tta gcg tca cgc aat atc ata cat gat gaa ttt gcg 480
Phe Ala Arg Phe Leu Ala Ser Arg Asn Ile Ile His Asp Glu Phe Ala
145 150 155 160

gat aaa ata ttg aaa gcg gta aag gtg aag aag gcg aac gct gat atc 528
Asp Lys Ile Leu Lys Ala Val Lys Val Lys Lys Ala Asn Ala Asp Ile
165 170 175

tac att cca acg ttg gaa gag ata aaa agg acg tta caa tta gca aaa 576
Tyr Ile Pro Thr Leu Glu Glu Ile Lys Arg Thr Leu Gln Leu Ala Lys
180 185 190

gac tat agc gaa aac gtc tac ttc atc tac cgt atc gct ctc gag tcg 624
Asp Tyr Ser Glu Asn Val Tyr Phe Ile Tyr Arg Ile Ala Leu Glu Ser
195 200 205

ggc gtt agg ctg agc gaa ata ctg aaa gtg ctg aag gaa ccc gaa agg 672
Gly Val Arg Leu Ser Glu Ile Leu Lys Val Leu Lys Glu Pro Glu Arg
210 215 220

gac att tgc ggt aac gac gtc tgt tat tat ccg ctt agt tgg act agg 720
Asp Ile Cys Gly Asn Asp Val Cys Tyr Tyr Pro Leu Ser Trp Thr Arg
225 230 235 240

gga tat aag ggc gtc ttc tat gta ttc cac ata acg cct ctg aag aga 768
Gly Tyr Lys Gly Val Phe Tyr Val Phe His Ile Thr Pro Leu Lys Arg
245 250 255

gta gag gtg acg aag tgg gca ata gcg gac ttt gaa cga cgt cat aag 816
Val Glu Val Thr Lys Trp Ala Ile Ala Asp Phe Glu Arg Arg His Lys
260 265 270

gac gct ata gcg ata aag tac ttc cgc aaa ttc gta gcg tct aag atg 864
Asp Ala Ile Ala Ile Lys Tyr Phe Arg Lys Phe Val Ala Ser Lys Met
275 280 285

gct gag cta agc gta ccg tta gat att atc gat ttt att caa ggg cgt 912
Ala Glu Leu Ser Val Pro Leu Asp Ile Ile Asp Phe Ile Gln Gly Arg
290 295 300

aaa ccg aca cgc gtt tta acg caa cat tac gta tcg ctc ttc ggc ata 960
Lys Pro Thr Arg Val Leu Thr Gln His Tyr Val Ser Leu Phe Gly Ile
305 310 315 320

gcg aaa gag caa tat aaa aag tat gcg gaa tgg cta aaa ggg gtc tga 1008
Ala Lys Glu Gln Tyr Lys Lys Tyr Ala Glu Trp Leu Lys Gly Val
325 330 335

<210> 65 
<211> 335 
<212> PRT 

<213> Artificial Sequence 

<223> Description of Artificial Sequence: vector 
pBS-SSV3 

<400> 65 

Met Thr Lys Asp Lys Thr Arg Tyr Lys Tyr Gly Asp Tyr Ile Leu Arg
1 5 10 15

Glu Arg Lys Gly Arg Tyr Tyr Val Tyr Lys Leu Glu Tyr Glu Asn Gly
20 25 30

Glu Val Lys Glu Arg Tyr Val Gly Pro Leu Ala Asp Val Val Glu Ser
35 40 45

Tyr Leu Lys Met Lys Leu Gly Val Val Gly Asp Thr Pro Leu Gln Ala
50 55 60

Asp Pro Pro Gly Phe Glu Pro Gly Thr Ser Gly Ser Gly Gly Gly Lys
65 70 75 80

Glu Gly Thr Glu Arg Arg Lys Ile Ala Leu Val Ala Asn Leu Arg Gln
85 90 95

Tyr Ala Thr Asp Gly Asn Ile Lys Ala Phe Tyr Asn Tyr Leu Met Asn
100 105 110

Glu Arg Gly Ile Ser Glu Lys Thr Ala Lys Asp Tyr Ile Asn Ala Ile
115 120 125

Ser Lys Pro Tyr Lys Glu Thr Arg Asp Ala Gln Lys Ala Tyr Arg Leu
130 135 140

Phe Ala Arg Phe Leu Ala Ser Arg Asn Ile Ile His Asp Glu Phe Ala
145 150 155 160

Asp Lys Ile Leu Lys Ala Val Lys Val Lys Lys Ala Asn Ala Asp Ile
165 170 175

Tyr Ile Pro Thr Leu Glu Glu Ile Lys Arg Thr Leu Gln Leu Ala Lys
180 185 190 

Asp Tyr Ser Glu Asn Val Tyr Phe Ile Tyr Arg Ile Ala Leu Glu Ser
195 200 205

Gly Val Arg Leu Ser Glu Ile Leu Lys Val Leu Lys Glu Pro Glu Arg
210 215 220

Asp Ile Cys Gly Asn Asp Val Cys Tyr Tyr Pro Leu Ser Trp Thr Arg
225 230 235 240

Gly Tyr Lys Gly Val Phe Tyr Val Phe His Ile Thr Pro Leu Lys Arg
245 250 255

Val Glu Val Thr Lys Trp Ala Ile Ala Asp Phe Glu Arg Arg His Lys
260 265 270

Asp Ala Ile Ala Ile Lys Tyr Phe Arg Lys Phe Val Ala Ser Lys Met
275 280 285

Ala Glu Leu Ser Val Pro Leu Asp Ile Ile Asp Phe Ile Gln Gly Arg
290 295 300

Lys Pro Thr Arg Val Leu Thr Gln His Tyr Val Ser Leu Phe Gly Ile
305 310 315 320

Ala Lys Glu Gln Tyr Lys Lys Tyr Ala Glu Trp Leu Lys Gly Val
325 330 335 



<210> 66 
<211> 1441 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein NLS-XisA

<220>

<221> CDS

<222> (1)..(1437)

<400> 66 

atg ccc aag aag aag agg aag gtg caa aat cag ggt caa gac aaa tat 48
Met Pro Lys Lys Lys Arg Lys Val Gln Asn Gln Gly Gln Asp Lys Tyr
1 5 10 15

caa caa gcc ttt gca gac tta gag cca ctt tca tct acc gac ggc agt 96
Gln Gln Ala Phe Ala Asp Leu Glu Pro Leu Ser Ser Thr Asp Gly Ser
20 25 30

ttt ctc ggc tca agt ctg caa gca cag cag caa aga gaa cac atg aga 144
Phe Leu Gly Ser Ser Leu Gln Ala Gln Gln Gln Arg Glu His Met Arg
35 40 45

aca aaa gta cta caa gac cta gac aag gta aat ctg cgt ttg aag tct 192
Thr Lys Val Leu Gln Asp Leu Asp Lys Val Asn Leu Arg Leu Lys Ser
50 55 60

gca aag acg aaa gtc tca gtt cga gaa tct aac gga agt ctg caa tta 240
Ala Lys Thr Lys Val Ser Val Arg Glu Ser Asn Gly Ser Leu Gln Leu
65 70 75 80

cga gca acg tta cca att aaa cct gga gat aag gac acc aac ggt aca 288
Arg Ala Thr Leu Pro Ile Lys Pro Gly Asp Lys Asp Thr Asn Gly Thr
85 90 95

ggc aga aag caa tac aat ctc agc ttg aat atc cct gca aac ttg gat 336
Gly Arg Lys Gln Tyr Asn Leu Ser Leu Asn Ile Pro Ala Asn Leu Asp
100 105 110

gga ctg aag acg gct gag gaa gaa gct tat gaa tta ggt aaa tta atc 384
Gly Leu Lys Thr Ala Glu Glu Glu Ala Tyr Glu Leu Gly Lys Leu Ile
115 120 125

gct cgg aaa acc ttt gaa tgg aat gat aaa tat tta ggc aaa gaa gcc 432
Ala Arg Lys Thr Phe Glu Trp Asn Asp Lys Tyr Leu Gly Lys Glu Ala
130 135 140

act aaa aaa gat tca caa aca ata ggt gat tta cta gaa aaa ttt gca 480
Thr Lys Lys Asp Ser Gln Thr Ile Gly Asp Leu Leu Glu Lys Phe Ala
145 150 155 160

gaa gag tat ttt aaa acc cat aaa cgc acc act aaa agc gaa cat acc 528
Glu Glu Tyr Phe Lys Thr His Lys Arg Thr Thr Lys Ser Glu His Thr
165 170 175

ttt ttt tac tat ttt tcc cgc acc caa cga tat acc aat tcc aaa gat 576
Phe Phe Tyr Tyr Phe Ser Arg Thr Gln Arg Tyr Thr Asn Ser Lys Asp
180 185 190

tta gca acg gcg gaa aat ctc atc aat tca att gag caa atc gat aaa 624
Leu Ala Thr Ala Glu Asn Leu Ile Asn Ser Ile Glu Gln Ile Asp Lys
195 200 205

gaa tgg gcg aga tat aat gcc gcc aga gcc ata tca gct ttt tgc ata 672
Glu Trp Ala Arg Tyr Asn Ala Ala Arg Ala Ile Ser Ala Phe Cys Ile
210 215 220

aca ttc aat ata gaa att gat ttg tcc cag tat tcc aaa atg cct gat 720
Thr Phe Asn Ile Glu Ile Asp Leu Ser Gln Tyr Ser Lys Met Pro Asp
225 230 235 240

cgc aat tcg cgc aac atc ccc aca gat gca gaa ata cta tca gga att 768
Arg Asn Ser Arg Asn Ile Pro Thr Asp Ala Glu Ile Leu Ser Gly Ile
245 250 255

acc aaa ttt gaa gac tat cta gtt acc aga gga aat caa gtt aat gaa 816
Thr Lys Phe Glu Asp Tyr Leu Val Thr Arg Gly Asn Gln Val Asn Glu
260 265 270

gat gta aaa gat agc tgg caa ctt tgg cgc tgg aca tat gga atg tta 864
Asp Val Lys Asp Ser Trp Gln Leu Trp Arg Trp Thr Tyr Gly Met Leu
275 280 285

gca gtt ttt ggt tta cgc ccc agg gaa att ttt att aac cct aat att 912
Ala Val Phe Gly Leu Arg Pro Arg Glu Ile Phe Ile Asn Pro Asn Ile
290 295 300

gat tgg tgg tta agc aaa gag aat ata gac ctc aca tgg aaa gta gac 960
Asp Trp Trp Leu Ser Lys Glu Asn Ile Asp Leu Thr Trp Lys Val Asp
305 310 315 320

aaa gaa tgt aaa act ggt gaa aga caa gca tta ccc tta cat aaa gaa 1008
Lys Glu Cys Lys Thr Gly Glu Arg Gln Ala Leu Pro Leu His Lys Glu
325 330 335

tgg att gat gag ttt gat tta aga aat ccg aaa tat tta gaa atg ctg 1056
Trp Ile Asp Glu Phe Asp Leu Arg Asn Pro Lys Tyr Leu Glu Met Leu
340 345 350

gca aca gca att agt aaa aaa gat aaa aca aat cat gct gaa ata aca 1104
Ala Thr Ala Ile Ser Lys Lys Asp Lys Thr Asn His Ala Glu Ile Thr
355 360 365

gcc tta act cag cgt att agt tgg tgg ttt cgg aaa gtc gaa tta gat 1152
Ala Leu Thr Gln Arg Ile Ser Trp Trp Phe Arg Lys Val Glu Leu Asp
370 375 380

ttt aaa ccc tat gat tta cgt cac gcc tgg gca atc aga gcg cat att 1200
Phe Lys Pro Tyr Asp Leu Arg His Ala Trp Ala Ile Arg Ala His Ile
385 390 395 400

tta ggc ata cca atc aaa gcg gcg gct gat aat ttg ggg cat agt atg 1248
Leu Gly Ile Pro Ile Lys Ala Ala Ala Asp Asn Leu Gly His Ser Met
405 410 415

cag gtt cat aca caa acc tat cag cgc tgg ttc tcg cta gat atg cgg 1296
Gln Val His Thr Gln Thr Tyr Gln Arg Trp Phe Ser Leu Asp Met Arg
420 425 430

aag tta gcg att aat cag gct ttg act aag agg aat gaa ttt gag gtg 1344
Lys Leu Ala Ile Asn Gln Ala Leu Thr Lys Arg Asn Glu Phe Glu Val
435 440 445

att agg gag gag aat gct aaa ttg cag ata gaa aat gaa agg ttg agg 1392
Ile Arg Glu Glu Asn Ala Lys Leu Gln Ile Glu Asn Glu Arg Leu Arg
450 455 460

atg gaa att gag aag tta aag atg gaa ata gct tat aag aat agt tgag 1441
Met Glu Ile Glu Lys Leu Lys Met Glu Ile Ala Tyr Lys Asn Ser
465 470 475



<210> 67 
<211> 479 
<212> PRT 

<213> Artificial Sequence 

<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein NLS-XisA

<400> 67

Met Pro Lys Lys Lys Arg Lys Val Gln Asn Gln Gly Gln Asp Lys Tyr
1 5 10 15

Gln Gln Ala Phe Ala Asp Leu Glu Pro Leu Ser Ser Thr Asp Gly Ser
20 25 30

Phe Leu Gly Ser Ser Leu Gln Ala Gln Gln Gln Arg Glu His Met Arg
35 40 45

Thr Lys Val Leu Gln Asp Leu Asp Lys Val Asn Leu Arg Leu Lys Ser
50 55 60

Ala Lys Thr Lys Val Ser Val Arg Glu Ser Asn Gly Ser Leu Gln Leu
65 70 75 80

Arg Ala Thr Leu Pro Ile Lys Pro Gly Asp Lys Asp Thr Asn Gly Thr
85 90 95

Gly Arg Lys Gln Tyr Asn Leu Ser Leu Asn Ile Pro Ala Asn Leu Asp
100 105 110

Gly Leu Lys Thr Ala Glu Glu Glu Ala Tyr Glu Leu Gly Lys Leu Ile
115 120 125

Ala Arg Lys Thr Phe Glu Trp Asn Asp Lys Tyr Leu Gly Lys Glu Ala
130 135 140

Thr Lys Lys Asp Ser Gln Thr Ile Gly Asp Leu Leu Glu Lys Phe Ala
145 150 155 160

Glu Glu Tyr Phe Lys Thr His Lys Arg Thr Thr Lys Ser Glu His Thr
165 170 175

Phe Phe Tyr Tyr Phe Ser Arg Thr Gln Arg Tyr Thr Asn Ser Lys Asp
180 185 190

Leu Ala Thr Ala Glu Asn Leu Ile Asn Ser Ile Glu Gln Ile Asp Lys
195 200 205

Glu Trp Ala Arg Tyr Asn Ala Ala Arg Ala Ile Ser Ala Phe Cys Ile
210 215 220

Thr Phe Asn Ile Glu Ile Asp Leu Ser Gln Tyr Ser Lys Met Pro Asp
225 230 235 240

Arg Asn Ser Arg Asn Ile Pro Thr Asp Ala Glu Ile Leu Ser Gly Ile
245 250 255

Thr Lys Phe Glu Asp Tyr Leu Val Thr Arg Gly Asn Gln Val Asn Glu
260 265 270

Asp Val Lys Asp Ser Trp Gln Leu Trp Arg Trp Thr Tyr Gly Met Leu
275 280 285

Ala Val Phe Gly Leu Arg Pro Arg Glu Ile Phe Ile Asn Pro Asn Ile
290 295 300

Asp Trp Trp Leu Ser Lys Glu Asn Ile Asp Leu Thr Trp Lys Val Asp
305 310 315 320

Lys Glu Cys Lys Thr Gly Glu Arg Gln Ala Leu Pro Leu His Lys Glu
325 330 335

Trp Ile Asp Glu Phe Asp Leu Arg Asn Pro Lys Tyr Leu Glu Met Leu
340 345 350

Ala Thr Ala Ile Ser Lys Lys Asp Lys Thr Asn His Ala Glu Ile Thr
355 360 365

Ala Leu Thr Gln Arg Ile Ser Trp Trp Phe Arg Lys Val Glu Leu Asp
370 375 380

Phe Lys Pro Tyr Asp Leu Arg His Ala Trp Ala Ile Arg Ala His Ile
385 390 395 400

Leu Gly Ile Pro Ile Lys Ala Ala Ala Asp Asn Leu Gly His Ser Met
405 410 415

Gln Val His Thr Gln Thr Tyr Gln Arg Trp Phe Ser Leu Asp Met Arg
420 425 430

Lys Leu Ala Ile Asn Gln Ala Leu Thr Lys Arg Asn Glu Phe Glu Val
435 440 445

Ile Arg Glu Glu Asn Ala Lys Leu Gln Ile Glu Asn Glu Arg Leu Arg
450 455 460

Met Glu Ile Glu Lys Leu Lys Met Glu Ile Ala Tyr Lys Asn Ser
465 470 475 
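
As a consistency check on the feature fields above, the CDS range given in the <222> field of SEQ ID NO: 66, (1)..(1437), can be compared with the residue count in the <211> field of SEQ ID NO: 67 (479). The following is a minimal sketch only; the function name `cds_residue_count` is hypothetical and is not part of the listing. It assumes a 1-based, inclusive range that excludes the stop codon.

```python
def cds_residue_count(start: int, end: int) -> int:
    """Return the number of codons in a 1-based, inclusive CDS range.

    Assumes the range excludes the stop codon, so the codon count
    equals the number of amino acid residues in the encoded protein.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


# Range taken from the <222> field of SEQ ID NO: 66.
print(cds_residue_count(1, 1437))  # prints 479
```

The result agrees with the 479 residues listed for SEQ ID NO: 67.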
<210> 68
<211> 1029
<212> DNA
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein NLS-Ssv

<220>

<221> CDS
<222> (1)..(1026)

<400> 68 

atg ccc aag aag aag agg aag gtg acg aaa gat aag acg cgt tat aaa 48
Met Pro Lys Lys Lys Arg Lys Val Thr Lys Asp Lys Thr Arg Tyr Lys
1 5 10 15

tac ggg gat tat att tta cgc gag agg aaa ggg cgg tat tat gtt tac 96
Tyr Gly Asp Tyr Ile Leu Arg Glu Arg Lys Gly Arg Tyr Tyr Val Tyr
20 25 30

aag cta gag tat gaa aac ggt gag gta aaa gag cgt tac gtg ggt cct 144
Lys Leu Glu Tyr Glu Asn Gly Glu Val Lys Glu Arg Tyr Val Gly Pro
35 40 45

tta gct gac gtc gtt gaa tca tat cta aaa atg aaa tta ggg gtc gta 192
Leu Ala Asp Val Val Glu Ser Tyr Leu Lys Met Lys Leu Gly Val Val
50 55 60

ggg gat act ccc cta caa gcg gat ccc ccc ggt ttc gag ccc ggg aca 240
Gly Asp Thr Pro Leu Gln Ala Asp Pro Pro Gly Phe Glu Pro Gly Thr
65 70 75 80

agc gga agc ggt ggt gga aaa gag gga act gaa cga cgt aaa ata gcg 288
Ser Gly Ser Gly Gly Gly Lys Glu Gly Thr Glu Arg Arg Lys Ile Ala
85 90 95

ttg gtt gcc aat ttg cgc caa tac gcg acg gac ggc aac ata aag gcg 336
Leu Val Ala Asn Leu Arg Gln Tyr Ala Thr Asp Gly Asn Ile Lys Ala
100 105 110

ttc tac aac tat ctc atg aac gaa agg ggg ata agc gaa aaa act gca 384
Phe Tyr Asn Tyr Leu Met Asn Glu Arg Gly Ile Ser Glu Lys Thr Ala
115 120 125

aag gac tac atc aat gct ata tca aag ccg tat aaa gag acg aga gac 432
Lys Asp Tyr Ile Asn Ala Ile Ser Lys Pro Tyr Lys Glu Thr Arg Asp
130 135 140

gca cag aag gct tac cga ctc ttt gca cgt ttc tta gcg tca cgc aat 480
Ala Gln Lys Ala Tyr Arg Leu Phe Ala Arg Phe Leu Ala Ser Arg Asn
145 150 155 160

atc ata cat gat gaa ttt gcg gat aaa ata ttg aaa gcg gta aag gtg 528
Ile Ile His Asp Glu Phe Ala Asp Lys Ile Leu Lys Ala Val Lys Val
165 170 175

aag aag gcg aac gct gat atc tac att cca acg ttg gaa gag ata aaa 576
Lys Lys Ala Asn Ala Asp Ile Tyr Ile Pro Thr Leu Glu Glu Ile Lys
180 185 190

agg acg tta caa tta gca aaa gac tat agc gaa aac gtc tac ttc atc 624
Arg Thr Leu Gln Leu Ala Lys Asp Tyr Ser Glu Asn Val Tyr Phe Ile
195 200 205

tac cgt atc gct ctc gag tcg ggc gtt agg ctg agc gaa ata ctg aaa 672
Tyr Arg Ile Ala Leu Glu Ser Gly Val Arg Leu Ser Glu Ile Leu Lys
210 215 220

gtg ctg aag gaa ccc gaa agg gac att tgc ggt aac gac gtc tgt tat 720
Val Leu Lys Glu Pro Glu Arg Asp Ile Cys Gly Asn Asp Val Cys Tyr
225 230 235 240

tat ccg ctt agt tgg act agg gga tat aag ggc gtc ttc tat gta ttc 768
Tyr Pro Leu Ser Trp Thr Arg Gly Tyr Lys Gly Val Phe Tyr Val Phe
245 250 255

cac ata acg cct ctg aag aga gta gag gtg acg aag tgg gca ata gcg 816
His Ile Thr Pro Leu Lys Arg Val Glu Val Thr Lys Trp Ala Ile Ala
260 265 270

gac ttt gaa cga cgt cat aag gac gct ata gcg ata aag tac ttc cgc 864
Asp Phe Glu Arg Arg His Lys Asp Ala Ile Ala Ile Lys Tyr Phe Arg
275 280 285

aaa ttc gta gcg tct aag atg gct gag cta agc gta ccg tta gat att 912
Lys Phe Val Ala Ser Lys Met Ala Glu Leu Ser Val Pro Leu Asp Ile
290 295 300

atc gat ttt att caa ggg cgt aaa ccg aca cgc gtt tta acg caa cat 960
Ile Asp Phe Ile Gln Gly Arg Lys Pro Thr Arg Val Leu Thr Gln His
305 310 315 320

tac gta tcg ctc ttc ggc ata gcg aaa gag caa tat aaa aag tat gcg 1008
Tyr Val Ser Leu Phe Gly Ile Ala Lys Glu Gln Tyr Lys Lys Tyr Ala
325 330 335

gaa tgg cta aaa ggg gtc tga 1029
Glu Wait Leu Lys Gly Val
340
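
The codon and residue lines of SEQ ID NO: 68 can be cross-checked by translating the nucleotide triplets with the standard genetic code. The sketch below is illustrative only: the names `CODON_TABLE` and `translate` are hypothetical, the standard code is an assumption (the listing does not name a translation table), and the output uses one-letter residue codes rather than the three-letter codes of the listing. The two input strings are the first and last codon lines of SEQ ID NO: 68.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) up to the first stop codon."""
    dna = "".join(dna.lower().split())
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        residues.append(aa)
    return "".join(residues)


# First eight codons of SEQ ID NO: 68 (Met Pro Lys Lys Lys Arg Lys Val).
print(translate("atg ccc aag aag aag agg aag gtg"))  # MPKKKRKV
# Final codon line of SEQ ID NO: 68 (Glu Trp Leu Lys Gly Val, then tga).
print(translate("gaa tgg cta aaa ggg gtc tga"))      # EWLKGV
```

Both results match the residue lines given under the corresponding codons in the listing.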



<210> 69 
<211> 342 
<212> PRT 

<213> Artificial Sequence 
<223> Description of Artificial Sequence: DNA sequence
coding for fusion protein NLS-Ssv

<400> 69

Met Pro Lys Lys Lys Arg Lys Val Thr Lys Asp Lys Thr Arg Tyr Lys
1 5 10 15

Tyr Gly Asp Tyr Ile Leu Arg Glu Arg Lys Gly Arg Tyr Tyr Val Tyr
20 25 30

Lys Leu Glu Tyr Glu Asn Gly Glu Val Lys Glu Arg Tyr Val Gly Pro
35 40 45

Leu Ala Asp Val Val Glu Ser Tyr Leu Lys Met Lys Leu Gly Val Val
50 55 60

Gly Asp Thr Pro Leu Gln Ala Asp Pro Pro Gly Phe Glu Pro Gly Thr
65 70 75 80

Ser Gly Ser Gly Gly Gly Lys Glu Gly Thr Glu Arg Arg Lys Ile Ala
85 90 95

Leu Val Ala Asn Leu Arg Gln Tyr Ala Thr Asp Gly Asn Ile Lys Ala
100 105 110

Phe Tyr Asn Tyr Leu Met Asn Glu Arg Gly Ile Ser Glu Lys Thr Ala
115 120 125

Lys Asp Tyr Ile Asn Ala Ile Ser Lys Pro Tyr Lys Glu Thr Arg Asp
130 135 140 

Ala Gln Lys Ala Tyr Arg Leu Phe Ala Arg Phe Leu Ala Ser Arg Asn
145 150 155 160

Ile Ile His Asp Glu Phe Ala Asp Lys Ile Leu Lys Ala Val Lys Val
165 170 175

Lys Lys Ala Asn Ala Asp Ile Tyr Ile Pro Thr Leu Glu Glu Ile Lys
180 185 190

Arg Thr Leu Gln Leu Ala Lys Asp Tyr Ser Glu Asn Val Tyr Phe Ile
195 200 205

Tyr Arg Ile Ala Leu Glu Ser Gly Val Arg Leu Ser Glu Ile Leu Lys
210 215 220

Val Leu Lys Glu Pro Glu Arg Asp Ile Cys Gly Asn Asp Val Cys Tyr
225 230 235 240

Tyr Pro Leu Ser Trp Thr Arg Gly Tyr Lys Gly Val Phe Tyr Val Phe
245 250 255

His Ile Thr Pro Leu Lys Arg Val Glu Val Thr Lys Trp Ala Ile Ala
260 265 270

Asp Phe Glu Arg Arg His Lys Asp Ala Ile Ala Ile Lys Tyr Phe Arg
275 280 285

Lys Phe Val Ala Ser Lys Met Ala Glu Leu Ser Val Pro Leu Asp Ile
290 295 300

Ile Asp Phe Ile Gln Gly Arg Lys Pro Thr Arg Val Leu Thr Gln His
305 310 315 320

Tyr Val Ser Leu Phe Gly Ile Ala Lys Glu Gln Tyr Lys Lys Tyr Ala
325 330 335

Glu Trp Leu Lys Gly Val
340 



<210> 70 
<211> 3908 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector
pBS-SSV3



<400> 70 

cacctaaatt gtaagcgtta atattttgtt aaaattcgcg ttaaattttt gttaaatcag 60
ctcatttttt aaccaatagg ccgaaatcgg caaaatccct tataaatcaa aagaatagac 120
cgagataggg ttgagtgttg ttccagtttg gaacaagagt ccactattaa agaacgtgga 180
ctccaacgtc aaagggcgaa aaaccgtcta tcagggcgat ggcccactac gtgaaccatc 240
accctaatca agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga accctaaagg 300
gagcccccga tttagagctt gacggggaaa gccggcgaac gtggcgagaa aggaagggaa 360
gaaagcgaaa ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac 420
caccacaccc gccgcgctta atgcgccgct acagggcgcg tcccattcgc cattcaggct 480
gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa 540
agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc agtcacgacg 600
ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat tggagctcca 660
ccgcggtggc ggccgcccga tatgacgaaa gataagacgc gttataaata cggggattat 720
attttacgcg agaggaaagg gcggtattat gtttacaagc tagagtatga aaacggtgag 780
gtaaaagagc gttacgtggg tcctttagct gacgtcgttg aatcatatct aaaaatgaaa 840 
ttaggggtcg taggggatac tcccctacaa gcggatcccc ccggtttcga gcccgggaca 900 
agcggaagcg gtggtggaaa agagggaact gaacgacgta aaatagcgtt ggttgccaat 960 
ttgcgccaat acgcgacgga cggcaacata aaggcgttct acaactatct catgaacgaa 1020 
agggggataa gcgaaaaaac tgcaaaggac tacatcaatg ctatatcaaa gccgtataaa 1080 
gagacgagag acgcacagaa ggcttaccga ctctttgcac gtttcttagc gtcacgcaat 1140
atcatacatg atgaatttgc ggataaaata ttgaaagcgg taaaggtgaa gaaggcgaac 1200 
gctgatatct acattccaac gttggaagag ataaaaagga cgttacaatt agcaaaagac 1260 
tatagcgaaa acgtctactt catctaccgt atcgctctcg agtcgggcgt taggctgagc 1320 
gaaatactga aagtgctgaa ggaacccgaa agggacattt gcggtaacga cgtctgttat 1380 
tatccgctta gttggactag gggatataag ggcgtcttct atgtattcca cataacgcct 1440 
ctgaagagag tagaggtgac gaagtgggca atagcggact ttgaacgacg tcataaggac 1500 
gctatagcga taaagtactt ccgcaaattc gtagcgtcta agatggctga gctaagcgta 1560 
ccgttagata ttatcgattt tattcaaggg cgtaaaccga cacgcgtttt aacgcaacat 1620 
tacgtatcgc tcttcggcat agcgaaagag caatataaaa agtatgcgga atggctaaaa 1680 
ggggtctgac tcgagggggg gcccggtacc cagcttttgt tccctttagt gagggttaat 1740 
ttcgagcttg gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt atccgatcac 1800 
aattccacac aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt 1860 
gagctaactc acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc 1920 
gtgccagctg cattaatgaa tcggccaacg cgcggggaga ggcggtttgc gtattgggcg 1980 
ctcttccgct tcctcgctca ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt 2040 
atcagctcac tcaaaggcgg taatacggtt atccacagaa tcaggggata acgcaggaaa 2100 
gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc 2160 
gtttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag 2220
gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt 2280 
gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg 2340 
aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg 2400 
ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg 2460
taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac 2520 
tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg 2580 
gcctaactac ggctacacta gaaggacagt atttggtatc tgcgctctgc tgaagccagt 2640
taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg 2700 
tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc 2760
tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt 2820 
ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt 2880 
taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag 2940
tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt 3000 
cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg caatgatacc 3060 
gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc 3120 
cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta attgttgccg 3180 
ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac 3240 
aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg 3300 
atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc 3360 
tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact 3420 
gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc 3480 
aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat 3540 
acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc 3600
ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac 3660 
tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa 3720 
aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact 3780 
catactcttc ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg 3840 
atacatattt gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg 3900 
aaaagtgc 3908



<210> 71 
<211> 3927 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pBS-SSV4 

<400> 71 

cacctaaatt gtaagcgtta atattttgtt aaaattcgcg ttaaattttt gttaaatcag 60 
ctcatttttt aaccaatagg ccgaaatcgg caaaatccct tataaatcaa aagaatagac 120
cgagataggg ttgagtgttg ttccagtttg gaacaagagt ccactattaa agaacgtgga 180 
ctccaacgtc aaagggcgaa aaaccgtcta tcagggcgat ggcccactac gtgaaccatc 240 
accctaatca agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga accctaaagg 300 
gagcccccga tttagagctt gacggggaaa gccggcgaac gtggcgagaa aggaagggaa 360
gaaagcgaaa ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac 420 
caccacaccc gccgcgctta atgcgccgct acagggcgcg tcccattcgc cattcaggct 480 
gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa 540 
agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc agtcacgacg 600 

ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat tggagctcca 660
ccgcggtggc ggccgcacca tgcccaagaa gaagaggaag gtgacgaaag ataagacgcg 720 
ttataaatac ggggattata ttttacgcga gaggaaaggg cggtattatg tttacaagct 780 
agagtatgaa aacggtgagg taaaagagcg ttacgtgggt cctttagctg acgtcgttga 840 
atcatatcta aaaatgaaat taggggtcgt aggggatact cccctacaag cggatccccc 900 

cggtttcgag cccgggacaa gcggaagcgg tggtggaaaa gagggaactg aacgacgtaa 960
aatagcgttg gttgccaatt tgcgccaata cgcgacggac ggcaacataa aggcgttcta 1020
caactatctc atgaacgaaa gggggataag cgaaaaaact gcaaaggact acatcaatgc 1080
tatatcaaag ccgtataaag agacgagaga cgcacagaag gcttaccgac tctttgcacg 1140
tttcttagcg tcacgcaata tcatacatga tgaatttgcg gataaaatat tgaaagcggt 1200

aaaggtgaag aaggcgaacg ctgatatcta cattccaacg ttggaagaga taaaaaggac 1260
gttacaatta gcaaaagact atagcgaaaa cgtctacttc atctaccgta tcgctctcga 1320 
gtcgggcgtt aggctgagcg aaatactgaa agtgctgaag gaacccgaaa gggacatttg 1380 
cggtaacgac gtctgttatt atccgcttag ttggactagg ggatataagg gcgtcttcta 1440 
tgtattccac ataacgcctc tgaagagagt agaggtgacg aagtgggcaa tagcggactt 1500 

tgaacgacgt cataaggacg ctatagcgat aaagtacttc cgcaaattcg tagcgtctaa 1560
gatggctgag ctaagcgtac cgttagatat tatcgatttt attcaagggc gtaaaccgac 1620 
acgcgtttta acgcaacatt acgtatcgct cttcggcata gcgaaagagc aatataaaaa 1680 
gtatgcggaa tggctaaaag gggtctgact cgaggggggg cccggtaccc agcttttgtt 1740 
ccctttagtg agggttaatt tcgagcttgg cgtaatcatg gtcatagctg tttcctgtgt 1800 

gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag 1860
cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt 1920
tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 1980
gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 2040
ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 2100

caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 2160
aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 2220
atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 2280
cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 2340 
ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 2400 

gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 2460
accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 2520 
cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 2580 
cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 2640 
gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 2700 

aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 2760
aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 2820 
actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 2880 
taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 2940 
gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 3000 

tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 3060
ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 3120
accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc 3180
agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca 3240
acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 3300

tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 3360
cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac 3420 
tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 3480 
ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt 3540 
gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc 3600 

tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat 3660
ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca 3720
gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga 3780
cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg 3840
gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg 3900

ttccgcgcac atttccccga aaagtgc 3927
<210> 72
<211> 3351 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pBS-SSVs 

<400> 72

cgacctcgag tcagacccct tttagccatt ccgcatactt tttatattgc tctttcgcta 60 
tgccgaagag cgatacgtaa tgttgcgtta aaacgcgtgt cggtttacgc ccttgaataa 120 
aatcgataat atctaacggt acgcttagct cagccatctt agacgctacg aatttgcgga 180 
agtactttat cgctatagcg tccttatgac gtcgttcaaa gtccgctatt gcccacttcg 240 

IS tcacctctac tctcttcaga ggcgttatgt ggaatacata gaagacgccc ttatatcccc 300 
tagtccaact aagcggataa taacagacgt cgttaccgca aatgtccctt tcgggttcct 360 
tcagcacttt cagtatttcg ctcagcctaa cgcccgactc gagggggggc ccggtaccca 420 
gcttttgttc cctttagtga gggttaattt cgagcttggc gtaatcatgg tcatagctgt 480 
ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa catacgagcc ggaagcataa 540 

agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg ttgcgctcac 600 
tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg 660 
cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact gactcgctgc 720 
gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta atacggttat 780 
ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag caaaaggcca 840 

ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc cctgacgagc 900 
atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta taaagatacc 960 
aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg 1020 
gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta 1080 
ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg 1140 

ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac ccggtaagac 1200 
acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg aggtatgtag 1260 
gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga aggacagtat 1320 
ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt agctcttgat 1380 
ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag cagattacgc 1440 

gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct gacgctcagt 1500 
ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg atcttcacct 1560 
agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat gagtaaactt 1620 
ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc tgtctatttc 1680 
gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg gagggcttac 1740 

catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct ccagatttat 1800 
cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca actttatccg 1860 
cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg ccagttaata 1920 
gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg tcgtttggta 1980 
tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc cccatgttgt 2040 

gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag ttggccgcag 2100 
tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg ccatccgtaa 2160 
gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag tgtatgcggc 2220 
gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat agcagaactt 2280 
taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg atcttaccgc 2340 

tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta 2400 
ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa 2460 
taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca 2520 
tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac 2580 
aaataggggt tccgcgcaca tttccccgaa aagtgccacc taaattgtaa gcgttaatat 2640 

tttgttaaaa ttcgcgttaa atttttgtta aatcagctca ttttttaacc aataggccga 2700 
aatcggcaaa atcccttata aatcaaaaga atagaccgag atagggttga gtgttgttcc 2760 
agtttggaac aagagtccac tattaaagaa cgtggactcc aacgtcaaag ggcgaaaaac 2820 
cgtctatcag ggcgatggcc cactacgtga accatcaccc taatcaagtt ttttggggtc 2880 
gaggtgccgt aaagcactaa atcggaaccc taaagggagc ccccgattta gagcttgacg 2940 

gggaaagccg gcgaacgtgg cgagaaagga agggaagaaa gcgaaaggag cgggcgctag 3000 
ggcgctggca agtgtagcgg tcacgctgcg cgtaaccacc acacccgccg cgcttaatgc 3060 
gccgctacag ggcgcgtccc attcgccatt caggctgcgc aactgttggg aagggcgatc 3120 
ggtgcgggcc tcttcgctat tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt 3180 
aagttgggta acgccagggt tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt 3240 

gtaatacgac tcactatagg gcgaattgga gctccaccgc ggtggcggcc gctctagaac 3300 
tagtggatcc cccgggctgc aggaattcga tatcaagctt atcgataccg t 3351 

<210> 73 
<211> 5730 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: vector 
pCMVC31(NNLS) 

<400> 73 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 

tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 

cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 

atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 

tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 

cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 

tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 

cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 

ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 

aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 

gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 

attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 

tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 

cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 

ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 

gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 

cagggcggcc gcaccatgcc caagaagaag aggaaggtga cacaaggggt tgtgaccggg 1020 

gtggacacgt acgcgggtgc ttacgaccgt cagtcgcgcg agcgcgagaa ttcgagcgca 1080 

gcaagcccag cgacacagcg tagcgccaac gaagacaagg cggccgacct tcagcgcgaa 1140 

gtcgagcgcg acgggggccg gttcaggttc gtcgggcatt tcagcgaagc gccgggcacg 1200 

tcggcgttcg ggacggcgga gcgcccggag ttcgaacgca tcctgaacga atgccgcgcc 1260 

gggcggctca acatgatcat tgtctatgac gtgtcgcgct tctcgcgcct gaaggtcatg 1320 

gacgcgattc cgattgtctc ggaattgctc gccctgggcg tgacgattgt ttccactcag 1380 

gaaggcgtct tccggcaggg aaacgtcatg gacctgattc acctgattat gcggctcgac 1440 

gcgtcgcaca aagaatcttc gctgaagtcg gcgaagattc tcgacacgaa gaaccttcag 1500 

cgcgaattgg gcgggtacgt cggcgggaag gcgccttacg gcttcgagct tgtttcggag 1560 

acgaaggaga tcacgcgcaa cggccgaatg gtcaatgtcg tcatcaacaa gcttgcgcac 1620 

tcgaccactc cccttaccgg acccttcgag ttcgagcccg acgtaatccg gtggtggtgg 1680 

cgtgagatca agacgcacaa acaccttccc ttcaagccgg gcagtcaagc cgccattcac 1740 

ccgggcagca tcacggggct ttgtaagcgc atggacgctg acgccgtgcc gacccggggc 1800 

gagacgattg ggaagaagac cgcttcaagc gcctgggacc cggcaaccgt tatgcgaatc 1860 

cttcgggacc cgcgtattgc gggcttcgcc gctgaggtga tctacaagaa gaagccggac 1920 

ggcacgccga ccacgaagat tgagggttac cgcattcagc gcgacccgat cacgctccgg 1980 

ccggtcgagc ttgattgcgg accgatcatc gagcccgctg agtggtatga gcttcaggcg 2040 

tggttggacg gcagggggcg cggcaagggg ctttcccggg ggcaagccat tctgtccgcc 2100 

atggacaagc tgtactgcga gtgtggcgcc gtcatgactt cgaagcgcgg ggaagaatcg 2160 

atcaaggact cttaccgctg ccgtcgccgg aaggtggtcg acccgtccgc acctgggcag 2220 

cacgaaggca cgtgcaacgt cagcatggcg gcactcgaca agttcgttgc ggaacgcatc 2280 

ttcaacaaga tcaggcacgc cgaaggcgac gaagagacgt tggcgcttct gtgggaagcc 2340 

gcccgacgct tcggcaagct cactgaggcg cctgagaaga gcggcgaacg ggcgaacctt 2400 

gttgcggagc gcgccgacgc cctgaacgcc cttgaagagc tgtacgaaga ccgcgcggca 2460 

ggcgcgtacg acggacccgt tggcaggaag cacttccgga agcaacaggc agcgctgacg 2520 

ctccggcagc aaggggcgga agagcggctt gccgaacttg aagccgccga agccccgaag 2580 

cttccccttg accaatggtt ccccgaagac gccgacgctg acccgaccgg ccctaagtcg 2640 

tggtgggggc gcgcgtcagt agacgacaag cgcgtgttcg tcgggctctt cgtagacaag 2700 

atcgttgtca cgaagtcgac tacgggcagg gggcagggaa cgcccatcga gaagcgcgct 2760 

tcgatcacgt gggcgaagcc gccgaccgac gacgacgaag acgacgccca ggacggcacg 2820 

gaagacgtag cggcgtaggc ggcgcccggg ctcgagatcc aggcgcggat caataaaaga 2880 

tcattatttt caatagatct gtgtgttggt tttttgtgtg ccttggggga gggggaggcc 2940 

agaatgaggc gcggccaagg gggaggggga ggccagaatg accttggggg agggggaggc 3000 

cagaatgacc ttgggggagg gggaggccag aatgaggcgc gcccccgggt accgagctcg 3060 

aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt tacccaactt 3120 

aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga ggcccgcacc 3180 

gatcgccctt cccaacagtt gcgcagcctg aatggcgaat ggcgcctgat gcggtatttt 3240 

ctccttacgc atctgtgcgg tatttcacac cgcatatggt gcactctcag tacaatctgc 3300 

tctgatgccg catagttaag ccagccccga cacccgccaa cacccgctga cgcgccctga 3360 
cgggcttgtc tgctcccggc atccgcttac agacaagctg tgaccgtctc cgggagctgc 3420 
atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg cctcgtgata 3480 
cgcctatttt tataggttaa tgtcatgata ataatggttt cttagacgtc aggtggcact 3540 
tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca ttcaaatatg 3600 
tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 3660 
atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt ttgccttcct 3720 
gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca gttgggtgca 3780 
cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag ttttcgcccc 3840 
gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc ggtattatcc 3900 
cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg 3960 
gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt aagagaatta 4020 
tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct gacaacgatc 4080 
ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt aactcgcctt 4140 
gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga caccacgatg 4200 
cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct 4260 
tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc acttctgcgc 4320 
tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga gcgtgggtct 4380 
cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt agttatctac 4440 
acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga gataggtgcc 4500 
tcactgatta agcattggta actgtcagac caagtttact catatatact ttagattgat 4560 
ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga taatctcatg 4620 
accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt agaaaagatc 4680 
aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa 4740 
ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag 4800 
gtaactggct tcagcagagc gcagatacca aatactgtcc ttctagtgta gccgtagtta 4860 
ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta 4920 
ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc aagacgatag 4980 
ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg 5040 
gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg 5100 
cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag 5160 
cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc 5220 
cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa 5280 
aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg 5340 
ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt tgagtgagct 5400 
gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa 5460 
gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta atgcagctgg 5520 
cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa tgtgagttag 5580 
ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat gttgtgtgga 5640 
attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta cgccaagcta 5700 
gcccgggcta gcttgcatgc ctgcaggttt 5730 



<210> 74 
<211> 4886 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pCMV-SSV 



<400> 74 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 

tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 

cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 

atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 

tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 

cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 

tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 

cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 

ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 

aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 

gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 

attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 

tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 

cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 

ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
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gccatacact tgagtgacat tgacatccac 
cagggcggcc gcccgatatg acgaaagata 
tacgcgagag gaaagggcgg tattatgttt 
aagagcgtta cgtgggtcct ttagctgacg 
5 gggtcgtagg ggatactccc ctacaagcgg 
gaagcggtgg tggaaaagag ggaactgaac 
gccaatacgc gacggacggti aacataaagg 
ggataagcga aaaaactgca aaggactaca 
cgagagacgc acagaaggct taccgactct 

10 tacatgatga atttgcggat aaaatattga 
atatctacat tccaacgttg gaagagataa 
gcgaaaacgt ctacttcatc taccgtatcg 
tactgaaagt gctgaaggaa cccgaaaggg 
cgcttagttg gactagggga tataagggcg 

15 agagagtaga ggtgacgaag tgggcaatag 
tagcgataaa gtacttccgc aaattcgtag 
tagatattat cgattttatt caagggcgta 
tatcgctctt cggcatagcg aaagagcaat 
tctgactcga gggggggccc gtcgacctcg 

20 tattttcaat agatctgtgt gttggttttt 
tgaggcgcgg ccaaggggga gggggaggcc 
atgaccttgg gggaggggga ggccagaatg 
cactggccgt cgttttacaa cgtcgtgact 
gccttgcagc acatccccct ttcgccagct 

25 gcccttccca acagttgcgc agcctgaatg 
ttacgcatct gtgcggtatt tcacaccgca 
atgccgcata gttaagccag ccccgacacc 
cttgtctgct cccggcatcc gcttacagac 
gtcagaggtt ttcaccgtca tcaccgaaac 

30 tatttttata ggttaatgtc atgataataa 
ggggaaatgt gcgcggaacc cctatttgtt 
cgctcatgag acaataaccc tgataaatgc 
gtattcaaca tttccgtgtc gcccttattc 
ttgctcaccc agaaacgctg gtgaaagtaa 

35 tgggttacat cgaactggat ctcaacagcg 
aacgttttcc aatgatgagc acttttaaag 
ttgacgccgg gcaagagcaa ctcggtcgcc 
agtactcacc agtcacagaa aagcatctta 
gtgctgccat aaccatgagt gataacactg 

40 gaccgaagga gctaaccgct tttttgcaca 
gttgggaacc ggagctgaat gaagccatac 
tagcaatggc aacaacgttg cgcaaactat 
ggcaacaatt aatagactgg atggaggcgg 
cccfctccggc tggctggttt attgctgata 

45 gtatcattgc agcactgggg ccagatggta 
cggggagtca ggcaactatg gatgaacgaa 
tgattaagca ttggtaactg tcagaccaag 
aacttcattt ttaatttaaa aggatctagg 
aaatccctta acgtgagttt tcgttccact 

50 gatcttcttg agatcctttt tttctgcgcg 
cgctaccagc ggtggtttgt ttgccggatc 
ctggcttcag cagagcgcag ataccaaata 
accacttcaa gaactctgta gcaccgccta 
tggctgctgc cagtggcgat aagtcgtgtc 

55 cggataaggc gcagcggtcg ggctgaacgg 
gaacgaccta caccgaactg agatacctac 
ccgaagggag aaaggcggac aggtatccgg 
cgagggagct tccaggggga aacgcctggt 
tctgacttga gcgtcgattt ttgtgatgct 

60 ccagcaacgc ggccttttta cggttcctgg 
ttcctgcgtt atcccctgat tctgtggata 
ccgctcgccg cagccgaacg accgagcgca 
gcccaatacg caaaccgcct ctccccgcgc 
acaggtttcc cgactggaaa gcgggcagtg 

65 ctcattaggc accccaggct ttacacttta 
tgagcggata acaatttcac acaggaaaca 
gggctagctt gcatgcctgc aggttt 
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tttgcctttc tctccacagg tgtccactcc 960 
agacgcgtta taaatacggg gattatattt 1020 
acaagctaga gtatgaaaac ggtgaggtaa 1080 
tcgttgaatc atatctaaaa atgaaattag 1140 
atccccccgg tttcgagccc gggacaagcg 1200 
gacgtaaaat agcgttggtt gccaatttgc 1260 
cgttctacaa ctatctcatg aacgaaaggg 1320 
tcaatgctat atcaaagccg tataaagaga 1380 
ttgcacgttt cttagcgtca cgcaatatca 14 40 
aagcggtaaa ggtgaagaag gcgaacgctg 1500 
aaaggacgtt acaattagca aaagactata 1560 
ctctcgagtc gggcgttagg ctgagcgaaa 1620 
acatttgcgg taacgacgtc tgttattatc 1680 
tcttctatgt attccacata acgcctctga 1740 
cggactttga acgacgtcat aaggacgcta 1800 
cgtctaagat ggctgagcta agcgtaccgt 19 60 
aaccgacacg cgttttaacg caacattacg 1920 
ataaaaagta tgcggaatgg ctaaaagggg 1980 
agatccaggc gcggatcaat aaaagatcat 2040 
tgtgtgcctt gggggagggg gaggccagaa 2100 
agaatgacct tgggggaggg ggaggccaga 2160 
aggcgcgccc ccgggtaccg agctcgaatt 2220 
gggaaaaccc tggcgttacc caacttaatc 2280 
ggcgtaatag cgaagaggcc cgcaccgatc 234 0 
gcgaatggcg cctgatgcgg tattttctcc 24 00 
tatggtgcac tctcagtaca atctgctctg 2460 
cgccaacacc cgctgacgcg ccctgacggg 2520 
aagctgtgac cgtctccggg agctgcatgt 2580 
gcgcgagacg aaagggcctc gtgatacgcc 2640 
tggtttctta gacgtcaggt ggcacttttc 2700 
tatttttcta aatacattca aatatgtatc 2760 
ttcaataata ttgaaaaagg aagagtatga 2820 
ccttttttgc ggcattttgc cttcctgttt 2880 
aagatgctga agatcagttg ggtgcacgag 294 0 
gtaagatcct tgagagtttt cgccccgaag 3000 
ttctgctatg tggcgcggta ttatcccgta 3060 
gcatacacta ttctcagaat gacttggttg 3120 
cggatggcat gacagtaaga gaattatgca 3180 
cggccaactt acttctgaca acgatcggag 3240 
acatggggga tcatgtaact cgccttgatc 3300 
caaacgacga gcgtgacacc acgatgcctg 3360 
taactggcga actacttact ctagcttccc 3420 
ataaagttgc aggaccacfct ctgcgctcgg 34 80 
aatctggagc cggtgagcgt gggtctcgcg 3540 
agccctcccg tatcgtagtt atctacacga 3600 
atagacagat cgctgagata ggtgcctcac 3660 
tttactcata tatactttag attgatttaa 3720 
tgaagatcct ttttgataat ctcatgacca 3780 
gagcgtcaga . ccccgtagaa aagatcaaag 3840 
taatctgctg cttgcaaaca aaaaaaccac 3900 
aagagctacc aactcttttt ccgaaggtaa 3960 
ctgtccttct agtgtagccg tagttaggcc 4020 
catacctcgc tctgctaatc ctgttaccag 4080 
ttaccgggtt ggactcaaga cgatagttac 4140 
ggggttcgtg cacacagccc agcttggagc 4200 
agcgtgagct atgagaaagc gccacgcttc 4260 
taagcggcag ggtcggaaca ggagagcgca 4320 
atctttatag tcctgtcggg tttcgccacc 4380 
cgtcaggggg gcggagccta tggaaaaacg 4440 
ccttttgctg gccttttgct cacatgttct 4500 
accgtattac cgcctttgag tgagctgata 4560 
gcgagtcagt gagcgaggaa gcggaagagc 4 620 
gttggccgat tcattaatg.c agctggcacg 4680 
agcgcaacgc aattaatgtg agttagctca 47 40 
tgcttccggc tcgtatgttg tgtggaattg 4 800 
gctatgacca tgattacgcc aagctagccc 4560 
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<210> 75 
<211> 4905 
5 <212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: vector 
pCMV-SSV(NNLS) 

<400> 75 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 

cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 
atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 

cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 
ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 

tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 
cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 
cagggcggcc gcaccatgcc caagaagaag aggaaggtga cgaaagataa gacgcgttat 1020 

aaatacgggg attatatttt acgcgagagg aaagggcggt attatgttta caagctagag 1080 
tatgaaaacg gtgaggtaaa agagcgttac gtgggtcctt tagctgacgt cgttgaatca 1140 
tatctaaaaa tgaaattagg ggtcgtaggg gatactcccc tacaagcgga tccccccggt 1200 
ttcgagcccg ggacaagcgg aagcggtggt ggaaaagagg gaactgaacg acgtaaaata 1260 
gcgttggttg ccaatttgcg ccaatacgcg acggacggca acataaaggc gttctacaac 1320 

tatctcatga acgaaagggg gataagcgaa aaaactgcaa aggactacat caatgctata 1380 
tcaaagccgt ataaagagac gagagacgca cagaaggctt accgactctt tgcacgtttc 1440 
ttagcgtcac gcaatatcat acatgatgaa tttgcggata aaatattgaa agcggtaaag 1500 
gtgaagaagg cgaacgctga tatctacatt ccaacgttgg aagagataaa aaggacgtta 1560 
caattagcaa aagactatag cgaaaacgtc tacttcatct accgtatcgc tctcgagtcg 1620 

ggcgttaggc tgagcgaaat actgaaagtg ctgaaggaac ccgaaaggga catttgcggt 1680 
aacgacgtct gttattatcc gcttagttgg actaggggat ataagggcgt cttctatgta 1740 
ttccacataa cgcctctgaa gagagtagag gtgacgaagt gggcaatagc ggactttgaa 1800 
cgacgtcata aggacgctat agcgataaag tacttccgca aattcgtagc gtctaagatg 1860 
gctgagctaa gcgtaccgtt agatattatc gattttattc aagggcgtaa accgacacgc 1920 

gttttaacgc aacattacgt atcgctcttc ggcatagcga aagagcaata taaaaagtat 1980 
gcggaatggc taaaaggggt ctgactcgag ggggggcccg tcgacctcga gatccaggcg 2040 
cggatcaata aaagatcatt attttcaata gatctgtgtg ttggtttttt gtgtgccttg 2100 
ggggaggggg aggccagaat gaggcgcggc caagggggag ggggaggcca gaatgacctt 2160 
gggggagggg gaggccagaa tgaccttggg ggagggggag gccagaatga ggcgcgcccc 2220 

cgggtaccga gctcgaattc actggccgtc gttttacaac gtcgtgactg ggaaaaccct 2280 
ggcgttaccc aacttaatcg ccttgcagca catccccctt tcgccagctg gcgtaatagc 2340 
gaagaggccc gcaccgatcg cccttcccaa cagttgcgca gcctgaatgg cgaatggcgc 2400 
ctgatgcggt attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact 2460 
ctcagtacaa tctgctctga tgccgcatag ttaagccagc cccgacaccc gccaacaccc 2520 

gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 2580 
gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgagacga 2640 
aagggcctcg tgatacgcct atttttatag gttaatgtca tgataataat ggtttcttag 2700 
acgtcaggtg gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa 2760 
atacattcaa atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat 2820 

tgaaaaagga agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg 2880 
gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa 2940 
gatcagttgg gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt 3000 
gagagttttc gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt 3060 
ggcgcggtat tatcccgtat tgacgccggg caagagcaac tcggtcgccg catacactat 3120 

tctcagaatg acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg 3180 
acagtaagag aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta 3240 
cttctgacaa cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat 3300 
catgtaactc gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag 3360 
cgtgacacca cgatgcctgt agcaatggca acaacgttgc gcaaactatt aactggcgaa 3420 
ctacttactc tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca 3480 
ggaccacttc tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc 3540 
ggtgagcgtg ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt 3600 
atcgtagtta tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc 3660 
gctgagatag gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat 3720 
atactttaga ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt 3780 
tttgataatc tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac 3840 
cccgtagaaa agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc 3900 
ttgcaaacaa aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca 3960 
actctttttc cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta 4020 
gtgtagccgt agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct 4080 
ctgctaatcc tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg 4140 
gactcaagac gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc 4200 
acacagccca gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcta 4260 
tgagaaagcg ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg 4320 
gtcggaacag gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt 4380 
cctgtcgggt ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg 4440 
cggagcctat ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg 4500 
ccttttgctc acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc 4560 
gcctttgagt gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg 4620 
agcgaggaag cggaagagcg cccaatacgc aaaccgcctc tccccgcgcg ttggccgatt 4680 
cattaatgca gctggcacga caggtttccc gactggaaag cgggcagtga gcgcaacgca 4740 
attaatgtga gttagctcac tcattaggca ccccaggctt tacactttat gcttccggct 4800 
cgtatgttgt gtggaattgt gagcggataa caatttcaca caggaaacag ctatgaccat 4860 
gattacgcca agctagcccg ggctagcttg catgcctgca ggttt 4905 



<210> 76 

<211> 5290 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pCMVXisA 

<400> 76 

agtccgatgt acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt 60 

aatcaattac ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta 120 

cggtaaatgg cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga 180 

cgtatgttcc catagtaacg ccaataggga ctttccattg acgtcaatgg gtggactatt 240 

tacggtaaac tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta 300 

ttgacgtcaa tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg 360 

actttcctac ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt 420 

tttggcagta catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc 480 

accccattga cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat 540 

gtcgtaacaa ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct 600 

atataagcag agctctctgg ctaactagag aacccactgc ttactggctt atcgaaatta 660 

atacgactca ctatagggag acccaagctg actctagact taattaagcg ttggggtgag 720 

tactccctct caaaagcggg catgacttct gcgctaagat tgtcagtttc caaaaacgag 780 

gaggatttga tattcacctg gcccgcggtg atgcctttga gggtggccgc gtccatctgg 840 

tcagaaaaga caatcttttt gttgtcaagc ttgaggtgtg gcaggcttga gatctggcca 900 

tacacttgag tgacattgac atccactttg cctttctctc cacaggtgtc cactcccagg 960 

gcggccgccc gatatgcaaa atcagggtca agacaaatat caacaagcct ttgcagactt 1020 

agagccactc tcatctaccg acggcagttt tctcggctca agtctgcaag cacagcagca 1080 

aagagaacac atgagaacaa aagtactaca agacctagac aaggtaaatc tgcgtttgaa 1140 

gtctgcaaag acgaaagtct cagttcgaga atctaacgga agtctgcaat tacgagcaac 1200 

gttaccaatt aaacctggag ataaggacac caacggtaca ggcagaaagc aatacaatct 1260 

cagcttgaat atccctgcaa acttggatgg actgaagacg gctgaggaag aagcttatga 1320 

attaggtaaa ttaatcgctc ggaaaacctt tgaatggaat gataaatatt taggcaaaga 1380 

agccactaaa aaagattcac aaacaatagg tgatttacta gaaaaatttg cagaagagta 1440 

ttttaaaacc cataaacgca ccactaaaag cgaacatacc tttttttact atttttcccg 1500 

cacccaacga tataccaatt ccaaagattt agcaacggcg gaaaatctca tcaattcaat 1560 

tgagcaaatc gataaagaat gggcgagata taatgccgcc agagccatat cagctttttg 1620 

cataacattc aatatagaaa ttgatttgtc ccagtattcc aaaatgcctg atcgcaattc 1680 
gcgcaacatc cccacagatg cagaaatact atcaggaatt accaaatttg aagactatct 1740 
agttaccaga ggaaatcaag ttaatgaaga tgtaaaagat agctggcaac tttggcgctg 1800 
gacatatgga atgttagcag tttttggttt acgccccagg gaaattttta ttaaccctaa 1860 
tattgattgg tggttaagca aagagaatat agacctcaca tggaaagtag acaaagaatg 1920 
taaaactggt gaaagacaag cattaccctt acataaagaa tggattgatg agtttgattt 1980 
aagaaatccg aaatatttag aaatgctggc aacagcaatt agtaaaaaag ataaaacaaa 2040 
tcatgctgaa ataacagcct taactcagcg tattagttgg tggtttcgga aagtcgaatt 2100 
agattttaaa ccctatgatt tacgtcacgc ctgggcaatt agagcgcata ttttaggcat 2160 
accaatcaaa gcggcggctg ataatttggg gcatagtatg caggttcata cacaaaccta 2220 
tcagcgctgg ttctcgctag atatgcggaa gttagcgatt aatcaggctt tgactaagag 2280 
gaatgaattt gaggtgatta gggaggagaa tgctaaattg cagatagaaa atgaaaggtt 2340 
gaggatggaa attgagaagt taaagatgga aatagcttat aagaatagtt gagcggccgc 2400 
gtcgacctcg agatccaggc gcggatcaat aaaagatcat tattttcaat agatctgtgt 2460 
gttggttttt tgtgtgcctt gggggagggg gaggccagaa tgaggcgcgg ccaaggggga 2520 
gggggaggcc agaatgacct tgggggaggg ggaggccaga atgaccttgg gggaggggga 2580 
ggccagaatg aggcgcgccc ccgggtaccg agctcgaatt cactggccgt cgttttacaa 2640 
cgtcgtgact gggaaaaccc tggcgttacc caacttaatc gccttgcagc acatccccct 2700 
ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca acagttgcgc 2760 
agcctgaatg gcgaatggcg cctgatgcgg tattttctcc ttacgcatct gtgcggtatt 2820 
tcacaccgca tatggtgcac tctcagtaca atctgctctg atgccgcata gttaagccag 2880 
ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct cccggcatcc 2940 
gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt ttcaccgtca 3000 
tcaccgaaac gcgcgagacg aaagggcctc gtgatacgcc tatttttata ggttaatgtc 3060 
atgataataa tggtttctta gacgtcaggt ggcacttttc ggggaaatgt gcgcggaacc 3120 
cctatttgtt tatttttcta aatacattca aatatgtatc cgctcatgag acaataaccc 3180 
tgataaatgc ttcaataata ttgaaaaagg aagagtatga gtattcaaca tttccgtgtc 3240 
gcccttattc ccttttttgc ggcattttgc cttcctgttt ttgctcaccc agaaacgctg 3300 
gtgaaagtaa aagatgctga agatcagttg ggtgcacgag tgggttacat cgaactggat 3360 
ctcaacagcg gtaagatcct tgagagtttt cgccccgaag aacgttttcc aatgatgagc 3420 
acttttaaag ttctgctatg tggcgcggta ttatcccgta ttgacgccgg gcaagagcaa 3480 
ctcggtcgcc gcatacacta ttctcagaat gacttggttg agtactcacc agtcacagaa 3540 
aagcatctta cggatggcat gacagtaaga gaattatgca gtgctgccat aaccatgagt 3600 
gataacactg cggccaactt acttctgaca acgatcggag gaccgaagga gctaaccgct 3660 
tttttgcaca acatggggga tcatgtaact cgccttgatc gttgggaacc ggagctgaat 3720 
gaagccatac caaacgacga gcgtgacacc acgatgcctg tagcaatggc aacaacgttg 3780 
cgcaaactat taactggcga actacttact ctagcttccc ggcaacaatt aatagactgg 3840 
atggaggcgg ataaagttgc aggaccactt ctgcgctcgg cccttccggc tggctggttt 3900 
attgctgata aatctggagc cggtgagcgt gggtctcgcg gtatcattgc agcactgggg 3960 
ccagatggta agccctcccg tatcgtagtt atctacacga cggggagtca ggcaactatg 4020 
gatgaacgaa atagacagat cgctgagata ggtgcctcac tgattaagca ttggtaactg 4080 
tcagaccaag tttactcata tatactttag attgatttaa aacttcattt ttaatttaaa 4140 
aggatctagg tgaagatcct ttttgataat ctcatgacca aaatccctta acgtgagttt 4200 
tcgttccact gagcgtcaga ccccgtagaa aagatcaaag gatcttcttg agatcctttt 4260 
tttctgcgcg taatctgctg cttgcaaaca aaaaaaccac cgctaccagc ggtggtttgt 4320 
ttgccggatc aagagctacc aactcttttt ccgaaggtaa ctggcttcag cagagcgcag 4380 
ataccaaata ctgtccttct agtgtagccg tagttaggcc accacttcaa gaactctgta 4440 
gcaccgccta catacctcgc tctgctaatc ctgttaccag tggctgctgc cagtggcgat 4500 
aagtcgtgtc ttaccgggtt ggactcaaga cgatagttac cggataaggc gcagcggtcg 4560 
ggctgaacgg ggggttcgtg cacacagccc agcttggagc gaacgaccta caccgaactg 4620 
agatacctac agcgtgagct atgagaaagc gccacgcttc ccgaagggag aaaggcggac 4680 
aggtatccgg taagcggcag ggtcggaaca ggagagcgca cgagggagct tccaggggga 4740 
aacgcctggt atctttatag tcctgtcggg tttcgccacc tctgacttga gcgtcgattt 4800 
ttgtgatgct cgtcaggggg gcggagccta tggaaaaacg ccagcaacgc ggccttttta 4860 
cggttcctgg ccttttgctg gccttttgct cacatgttct ttcctgcgtt atcccctgat 4920 
tctgtggata accgtattac cgcctttgag tgagctgata ccgctcgccg cagccgaacg 4980 
accgagcgca gcgagtcagt gagcgaggaa gcggaagagc gcccaatacg caaaccgcct 5040 
ctccccgcgc gttggccgat tcattaatgc agctggcacg acaggtttcc cgactggaaa 5100 
gcgggcagtg agcgcaacgc aattaatgtg agttagctca ctcattaggc accccaggct 5160 
ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata acaatttcac 5220 
acaggaaaca gctatgacca tgattacgcc aagctagccc gggctagctt gcatgcctgc 5280 
aggtttaaac 5290 

<220> 

<223> Description of Artificial Sequence: vector 
pCMVXisANNLS 

<400> 77 

aaacagtccg atgtacgggc cagatatacg cgttgacatt gattattgac tagttattaa 60 
tagtaatcaa ttacggggtc attagttcat agcccatata tggagttccg cgttacataa 120 
cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt gacgtcaata 180 

atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca atgggtggac 240 
tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc aagtacgccc 300 
cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta catgacctta 360 
tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac catggtgatg 420 
cggttttggc agtacatcaa tgggcgtgga tagcggtttg actcacgggg atttccaagt 480 

ctccacccca ttgacgtcaa tgggagtttg ttttggcacc aaaatcaacg ggactttcca 540 
aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt acggtgggag 600 
gtctatataa gcagagctct ctggctaact agagaaccca ctgcttactg gcttatcgaa 660 
attaatacga ctcactatag ggagacccaa gctgactcta gacttaatta agcgttgggg 720 
tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780 

cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 
ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900 
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960 
cagggcggcc gcaccatgcc caagaagaag aggaaggtgc aaaatcaggg tcaagacaaa 1020 
tatcaacaag cctttgcaga cttagagcca ctttcatcta ccgacggcag ttttctcggc 1080 

tcaagtctgc aagcacagca gcaaagagaa cacatgagaa caaaagtact acaagaccta 1140 
gacaaggtaa atctgcgttt gaagtctgca aagacgaaag tctcagttcg agaatctaac 1200 
ggaagtctgc aattacgagc aacgttacca attaaacctg gagataagga caccaacggt 1260 
acaggcagaa agcaatacaa tctcagcttg aatatccctg caaacttgga tggactgaag 1320 
acggctgagg aagaagctta tgaattaggt aaattaatcg ctcggaaaac ctttgaatgg 1380 

aatgataaat atttaggcaa agaagccact aaaaaagatt cacaaacaat aggtgattta 1440
ctagaaaaat ttgcagaaga gtattttaaa acccataaac gcaccactaa aagcgaacat 1500 
accttttttt actatttttc ccgcacccaa cgatatacca attccaaaga tttagcaacg 1560 
gcggaaaatc tcatcaattc aattgagcaa atcgataaag aatgggcgag atataatgcc 1620 
gccagagcca tatcagcttt ttgcataaca ttcaatatag aaattgattt gtcccagtat 1680 

tccaaaatgc ctgatcgcaa ttcgcgcaac atccccacag atgcagaaat actatcagga 1740
attaccaaat ttgaagacta tctagttacc agaggaaatc aagttaatga agatgtaaaa 1800 
gatagctggc aactttggcg ctggacatat ggaatgttag cagtttttgg tttacgcccc 1860
agggaaattt ttattaaccc taatattgat tggtggttaa gcaaagagaa tatagacctc 1920 
acatggaaag tagacaaaga atgtaaaact ggtgaaagac aagcattacc cttacataaa 1980 

gaatggattg atgagtttga tttaagaaat ccgaaatatt tagaaatgct ggcaacagca 2040
attagtaaaa aagataaaac aaatcatgct gaaataacag ccttaactca gcgtattagt 2100 
tggtggtttc ggaaagtcga attagatttt aaaccctatg atttacgtca cgcctgggca 2160 
atcagagcgc atattttagg cataccaatc aaagcggcgg ctgataattt ggggcatagt 2220 
atgcaggttc atacacaaac ctatcagcgc tggttctcgc tagatatgcg gaagttagcg 2280 

attaatcagg ctttgactaa gaggaatgaa tttgaggtga ttagggagga gaatgctaaa 2340
ttgcagatag aaaatgaaag gttgaggatg gaaattgaga agttaaagat ggaaatagct 2400 
tataagaata gttgagcggc cgcgtcgacc tcgagatcca ggcgcggatc aataaaagat 2460
cattattttc aatagatctg tgtgttggtt ttttgtgtgc cttgggggag ggggaggcca 2520 
gaatgaggcg cggccaaggg ggagggggag gccagaatga ccttggggga gggggaggcc 2580 

agaatgacct tgggggaggg ggaggccaga atgaggcgcg cccccgggta ccgagctcga 2640
attcactggc cgtcgtttta caacgtcgtg actgggaaaa ccctggcgtt acccaactta 2700 
atcgccttgc agcacatccc cctttcgcca gctggcgtaa tagcgaagag gcccgcaccg 2760 
atcgcccttc ccaacagttg cgcagcctga atggcgaatg gcgcctgatg cggtattttc 2820 
tccttacgca tctgtgcggt atttcacacc gcatatggtg cactctcagt acaatctgct 2880

ctgatgccgc atagttaagc cagccccgac acccgccaac acccgctgac gcgccctgac 2940
gggcttgtct gctcccggca tccgcttaca gacaagctgt gaccgtctcc gggagctgca 3000 
tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag acgaaagggc ctcgtgatac 3060 
gcctattttt ataggttaat gtcatgataa taatggtttc ttagacgtca ggtggcactt 3120 
ttcggggaaa tgtgcgcgga acccctattt gtttattttt ctaaatacat tcaaatatgt 3180 

atccgctcat gagacaataa ccctgataaa tgcttcaata atattgaaaa aggaagagta 3240
tgagtattca acatttccgt gtcgccctta ttcccttttt tgcggcattt tgccttcctg 3300 
tttttgctca cccagaaacg ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac 3360 
gagtgggtta catcgaactg gatctcaaca gcggtaagat ccttgagagt tttcgccccg 3420 
aagaacgttt tccaatgatg agcactttta aagttctgct atgtggcgcg gtattatccc 3480 

gtattgacgc cgggcaagag caactcggtc gccgcataca ctattctcag aatgacttgg 3540
ttgagtactc accagtcaca gaaaagcatc ttacggatgg catgacagta agagaattat 3600 
gcagtgctgc cataaccatg agtgataaca ctgcggccaa cttacttctg acaacgatcg 3660 
gaggaccgaa ggagctaacc gcttttttgc acaacatggg ggatcatgta actcgccttg 3720

atcgttggga accggagctg aatgaagcca taccaaacga cgagcgtgac accacgatgc 3780 

ctgtagcaat ggcaacaacg ttgcgcaaac tattaactgg cgaactactt actctagctt 3840 

cccggcaaca attaatagac tggatggagg cggataaagt tgcaggacca cttctgcgct 3900 

cggcccttcc ggctggctgg tttattgctg ataaatctgg agccggtgag cgtgggtctc 3960

gcggtatcat tgcagcactg gggccagatg gtaagccctc ccgtatcgta gttatctaca 4020 

cgacggggag tcaggcaact atggatgaac gaaatagaca gatcgctgag ataggtgcct 4080 

cactgattaa gcattggtaa ctgtcagacc aagtttactc atatatactt tagattgatt 4140

taaaacttca tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga 4200 

ccaaaatccc ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca 4260

aaggatcttc ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac 4320 

caccgctacc agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg 4380 

taactggctt cagcagagcg cagataccaa atactgtcct tctagtgtag ccgtagttag 4440 

gccaccactt caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac 4500 

cagtggctgc tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt 4560

taccggataa ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg 4620

agcgaacgac ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc 4680

ttcccgaagg gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc 4740

gcacgaggga gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc 4800 

acctctgact tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa 4860

acgccagcaa cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt 4920

tctttcctgc gttatcccct gattctgtgg ataaccgtat taccgccttt gagtgagctg 4980 

ataccgctcg ccgcagccga acgaccgagc gcagcgagtc agtgagcgag gaagcggaag 5040 

agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 5100 

acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 5160

tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 5220 

ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagctag 5280 

cccgggctag cttgcatgcc tgcaggttt 5309 


<210> 78 
<211> 7608 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> Description of Artificial Sequence: vector 
pPGKnifD 

<400> 78

tcgaggaatt ctaccgggta ggggaggcgc ttttcccaag gcagtctgga gcatgcgctt 60 
tagcagcccc gctgggcact tggcgctaca caagtggcct ctggctcgca cacattccac 120 
atccaccggt aggcgccaac cggctccgtt ctttggtggc cccttcgcgc caccttctac 180 
tcctccccta gtcaggaagt tcccccccgc cccgcagctc gcgtcgtgca ggacgtgaca 240 

aatggaagta gcacgtctca ctagtctcgt gcagatggac agcaccgctg agcaatggaa 300
gcgggtaggc ctttggggca gcggccaata gcagctttgc tccttcgctt tctgggctca 360 
gaggctggga aggggtgggt ccgggggcgg gctcaggggc gggctcaggg gcggggcggg 420 
cgcccgaagg tcctccggag gcccggcatt ctgcacgctt caaaagcgca cgtctgccgc 480 
gctgttctcc tcttcctcat ctccgggcct ttcgacctgc agcccggtac agttcgaatg 540 

gctcttccct tccgtcaaat gcactcttgg gattactccg aacctagcga tggggtgcaa 600
atgtcagatc agataagttc gaataacttc gtatagcata cattatacga agttataagc 660 
ttgcatgcct gcaggtcggc cgccacgacc ggccggccgg tgccgccacc atcccctgac 720 
ccacgcccct gacccctcac aaggagacga ccttccatga ccgagtacaa gcccacggtg 780
cgcctcgcca cccgcgacga cgtcccccgg gccgtacgca ccctcgccgc cgcgttcgcc 840

gactaccccg ccacgcgcca caccgtcgac ccggaccgcc acatcgagcg ggtcaccgag 900
ctgcaagaac tcttcctcac gcgcgtcggg ctcgacatcg gcaaggtgtg ggtcgcggac 960 
gacggcgccg cggtggcggt ctggaccacg ccggagagcg tcgaagcggg ggcggtgttc 1020 
gccgagatcg gcccgcgcat ggccgagttg agcggttccc ggctggccgc gcagcaacag 1080 
atggaaggcc tcctggcgcc gcaccggccc aaggagcccg cgtggttcct ggccaccgtc 1140

ggcgtctcgc ccgaccacca gggcaagggt ctgggcagcg ccgtcgtgct ccccggagtg 1200
gaggcggccg agcgcgccgg ggtgcccgcc ttcctggaga cctccgcgcc ccgcaacctc 1260 
cccttctacg agcggctcgg cttcaccgtc accgccgacg tcgagtgccc gaaggaccgc 1320 
gcgacctggt gcatgacccg caagcccggt gcctgacgcc cgccccacga cccgcagcgc 1380 
ccgaccgaaa ggagcgcacg accccatggc tccgaccgaa gccgacccgg gcggccccgc 1440 

cgaccccgca cccgcccccg aggcccaccg actctagagg atcataatca gccataccac 1500
atttgtagag gttttacttg ctttaaaaaa cctcccacac ctccccctga acctgaaaca 1560 
taaaatgaat gcaattgttg ttgttaactt gtttattgca gcttataatg gttacaaata 1620 
aagcaatagc atcacaaatt tcacaaataa agcatttttt tcactgcatt ctagttgtgg 1680

tttgtccaaa ctcatcaatg tatcttatca tgtctggatc cagctgttga aagctattaa 1740 

accacaaaaa ggattactcc ggcccttatc acggttacga cggatttgga tccataactt 1800 

cgtatagcat acattatacg aagttatacc gggccaccat ggtcgcgagt agcttggcac 1860

tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg cgttacccaa cttaatcgcc 1920

ttgcagcaca tccccctttc gccagctggc gtaatagcga agaggcccgc accgatcgcc 1980 

cttcccaaca gttgcgcagc ctgaatggcg aatggcgctt tgcctggttt ccggcaccag 2040 

aagcggtgcc ggaaagctgg ctggagtgcg atcttcctga ggccgatact gtcgtcgtcc 2100 

cctcaaactg gcagatgcac ggttacgatg cgcccatcta caccaacgta acctatccca 2160 

ttacggtcaa tccgccgttt gttcccacgg agaatccgac gggttgttac tcgctcacat 2220

ttaatgttga tgaaagctgg ctacaggaag gccagacgcg aattattttt gatggcgtta 2280 

actcggcgtt tcatctgtgg tgcaacgggc gctgggtcgg ttacggccag gacagtcgtt 2340 

tgccgtctga atttgacctg agcgcatttt tacgcgccgg agaaaaccgc ctcgcggtga 2400 

tggtgctgcg ttggagtgac ggcagttatc tggaagatca ggatatgtgg cggatgagcg 2460

gcattttccg tgacgtctcg ttgctgcata aaccgactac acaaatcagc gatttccatg 2520

ttgccactcg ctttaatgat gatttcagcc gcgctgtact ggaggctgaa gttcagatgt 2580 

gcggcgagtt gcgtgactac ctacgggtaa cagtttcttt atggcagggt gaaacgcagg 2640

tcgccagcgg caccgcgcct ttcggcggtg aaattatcga tgagcgtggt ggttatgccg 2700 

atcgcgtcac actacgtctg aacgtcgaaa acccgaaact gtggagcgcc gaaatcccga 2760

atctctatcg tgcggtggtt gaactgcaca ccgccgacgg cacgctgatt gaagcagaag 2820

cctgcgatgt cggtttccgc gaggtgcgga ttgaaaatgg tctgctgctg ctgaacggca 2880 

agccgttgct gattcgaggc gttaaccgtc acgagcatca tcctctgcat ggtcaggtca 2940 

tggatgagca gacgatggtg caggatatcc tgctgatgaa gcagaacaac tttaacgccg 3000

tgcgctgttc gcattatccg aaccatccgc tgtggtacac gctgtgcgac cgctacggcc 3060

tgtatgtggt ggatgaagcc aatattgaaa cccacggcat ggtgccaatg aatcgtctga 3120

ccgatgatcc gcgctggcta ccggcgatga gcgaacgcgt aacgcgaatg gtgcagcgcg 3180 

atcgtaatca cccgagtgtg atcatctggt cgctggggaa tgaatcaggc cacggcgcta 3240 

atcacgacgc gctgtatcgc tggatcaaat ctgtcgatcc ttcccgcccg gtgcagtatg 3300 

aaggcggcgg agccgacacc acggccaccg atattatttg cccgatgtac gcgcgcgtgg 3360 

atgaagacca gcccttcccg gctgtgccga aatggtccat caaaaaatgg ctttcgctac 3420

ctggagagac gcgcccgctg atcctttgcg aatacgccca cgcgatgggt aacagtcttg 3480 

gcggtttcgc taaatactgg caggcgtttc gtcagtatcc ccgtttacag ggcggcttcg 3540 

tctgggactg ggtggatcag tcgctgatta aatatgatga aaacggcaac ccgtggtcgg 3600 

cttacggcgg tgattttggc gatacgccga acgatcgcca gttctgtatg aacggtctgg 3660 

tctttgccga ccgcacgccg catccagcgc tgacggaagc aaaacaccag cagcagtttt 3720

tccagttccg tttatccggg caaaccatcg aagtgaccag cgaatacctg ttccgtcata 3780 

gcgataacga gctcctgcac tggatggtgg cgctggatgg taagccgctg gcaagcggtg 3840 

aagtgcctct ggatgtcgct ccacaaggta aacagttgat tgaactgcct gaactaccgc 3900 

agccggagag cgccgggcaa ctctggctca cagtacgcgt agtgcaaccg aacgcgaccg 3960 

catggtcaga agccgggcac atcagcgcct ggcagcagtg gcgtctggcg gaaaacctca 4020

gtgtgacgct ccccgccgcg tcccacgcca tcccgcatct gaccaccagc gaaatggatt 4080 

tttgcatcga gctgggtaat aagcgttggc aatttaaccg ccagtcaggc tttctttcac 4140

agatgtggat tggcgataaa aaacaactgc tgacgccgct gcgcgatcag ttcacccgtg 4200 

caccgctgga taacgacatt ggcgtaagtg aagcgacccg cattgaccct aacgcctggg 4260 

tcgaacgctg gaaggcggcg ggccattacc aggccgaagc agcgttgttg cagtgcacgg 4320

cagatacact tgctgatgcg gtgctgatta cgaccgctca cgcgtggcag catcagggga 4380 

aaaccttatt tatcagccgg aaaacctacc ggattgatgg tagtggtcaa atggcgatta 4440 

ccgttgatgt tgaagtggcg agcgatacac cgcatccggc gcggattggc ctgaactgcc 4500 

agctggcgca ggtagcagag cgggtaaact ggctcggatt agggccgcaa gaaaactatc 4560 

ccgaccgcct tactgccgcc tgttttgacc gctgggatct gccattgtca gacatgtata 4620

ccccgtacgt cttcccgagc gaaaacggtc tgcgctgcgg gacgcgcgaa ttgaattatg 4680

gcccacacca gtggcgcggc gacttccagt tcaacatcag ccgctacagt caacagcaac 4740 

tgatggaaac cagccatcgc catctgctgc acgcggaaga aggcacatgg ctgaatatcg 4800 

acggtttcca tatggggatt ggtggcgacg actcctggag cccgtcagta tcggcggaat 4860 

tccagctgag cgccggtcgc taccattacc agttggtctg gtgtcaaaaa taataataac 4920

cgggcagggg ggatctttgt gaaggaacct tacttctgtg gtgtgacata attggacaaa 4980

ctacctacag agatttaaag ctctaaggta aatataaaat ttttaagtgt ataatgtgtt 5040 

aaactactga ttctaattgt ttgtgtattt tagattccaa cctatggaac tgatgaatgg 5100 

gagcagtggt ggaatgccag atccagacat gataagatac attgatgagt ttggacaaac 5160 

cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg ctattgcttt 5220

atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca ttcattttat 5280

gtttcaggtt cagggggagg tgtgggaggt tttttaaagc aagtaaaacc tctacaaatg 5340 

tggtatggct gattatgatc tgcggccgca gggcctcgtg atacgcctat ttttataggt 5400 

taatgtcatg ataataatgg tttcttagac gtcaggtggc acttttcggg gaaatgtgcg 5460

cggaacccct atttgtttat ttttctaaat acattcaaat atgtatccgc tcatgagaca 5520

ataaccctga taaatgcttc aataatattg aaaaaggaag agtatgagta ttcaacattt 5580 

ccgtgtcgcc cttattccct tttttgcggc attttgcctt cctgtttttg ctcacccaga 5640 




aacgctggtg aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg gttacatcga 5700
actggatctc aacagcggta agatccttga gagttttcgc cccgaagaac gttttccaat 5760
gatgagcact tttaaagttc tgctatgtgg cgcggtatta tcccgtattg acgccgggca 5820 
agagcaactc ggtcgccgca tacactattc tcagaatgac ttggttgagt actcaccagt 5880 
cacagaaaag catcttacgg atggcatgac agtaagagaa ttatgcagtg ctgccataac 5940
catgagtgat aacactgcgg ccaacttact tctgacaacg atcggaggac cgaaggagct 6000
aaccgctttt ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt gggaaccgga 6060
gctgaatgaa gccataccaa acgacgagcg tgacaccacg atgcctgtag caatggcaac 6120 
aacgttgcgc aaactattaa ctggcgaact acttactcta gcttcccggc aacaattaat 6180 

agactggatg gaggcggata aagttgcagg accacttctg cgctcggccc ttccggctgg 6240
ctggtttatt gctgataaat ctggagccgg tgagcgtggg tctcgcggta tcattgcagc 6300 
actggggcca gatggtaagc cctcccgtat cgtagttatc tacacgacgg ggagtcaggc 6360 
aactatggat gaacgaaata gacagatcgc tgagataggt gcctcactga ttaagcattg 6420 
gtaactgtca gaccaagttt actcatatat actttagatt gatttaaaac ttcattttta 6480 

atttaaaagg atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg 6540
tgagttttcg ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga 6600 
tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt 6660 
ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag 6720 
agcgcagata ccaaatactg tccttctagt gtagccgtag ttaggccacc acttcaagaa 6780

ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag 6840
tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca 6900 
gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac 6960 
cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa 7020 
ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc 7080

agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg 7140
tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc 7200
ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc ctgcgttatc 7260
ccctgattct gtggataacc gtattaccgc ctttgagtga gctgataccg ctcgccgcag 7320 
ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa 7380 

accgcctctc cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga 7440
ctggaaagcg ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc 7500
ccaggcttta cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca 7560
atttcacaca ggaaacagct atgaccatga ttacgccaag ctggcgcg 7608


<210> 79 
<211> 7523 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> Description of Artificial Sequence: vector 
pPGKnifD3 T 

<400> 79

tcgaggaatt ctaccgggta ggggaggcgc ttttcccaag gcagtctgga gcatgcgctt 60
tagcagcccc gctgggcact tggcgctaca caagtggcct ctggctcgca cacattccac 120
atccaccggt aggcgccaac cggctccgtt ctttggtggc cccttcgcgc caccttctac 180
tcctccccta gtcaggaagt tcccccccgc cccgcagctc gcgtcgtgca ggacgtgaca 240
aatggaagta gcacgtctca ctagtctcgt gcagatggac agcaccgctg agcaatggaa 300
gcgggtaggc ctttggggca gcggccaata gcagctttgc tccttcgctt tctgggctca 360
gaggctggga aggggtgggt ccgggggcgg gctcaggggc gggctcaggg gcggggcggg 420
cgcccgaagg tcctccggag gcccggcatt ctgcacgctt caaaagcgca cgtctgccgc 480
gctgttctcc tcttcctcat ctccgggcct ttcgacctgc agcccggtac agttcgaata 540
acttcgtata gcatacatta tacgaagtta taagcttgca tgcctgcagg tcggccgcca 600
cgaccggccg gccggtgccg ccaccatccc ctgacccacg cccctgaccc ctcacaagga 660
gacgaccttc catgaccgag tacaagccca cggtgcgcct cgccacccgc gacgacgtcc 720
cccgggccgt acgcaccctc gccgccgcgt tcgccgacta ccccgccacg cgccacaccg 780
tcgacccgga ccgccacatc gagcgggtca ccgagctgca agaactcttc ctcacgcgcg 840
tcgggctcga catcggcaag gtgtgggtcg cggacgacgg cgccgcggtg gcggtctgga 900
ccacgccgga gagcgtcgaa gcgggggcgg tgttcgccga gatcggcccg cgcatggccg 960
agttgagcgg ttcccggctg gccgcgcagc aacagatgga aggcctcctg gcgccgcacc 1020
ggcccaagga gcccgcgtgg ttcctggcca ccgtcggcgt ctcgcccgac caccagggca 1080
agggtctggg cagcgccgtc gtgctccccg gagtggaggc ggccgagcgc gccggggtgc 1140
ccgccttcct ggagacctcc gcgccccgca acctcccctt ctacgagcgg ctcggcttca 1200
ccgtcaccgc cgacgtcgag tgcccgaagg accgcgcgac ctggtgcatg acccgcaagc 1260
ccggtgcctg acgcccgccc cacgacccgc agcgcccgac cgaaaggagc gcacgacccc 1320
atggctccga ccgaagccga cccgggcggc cccgccgacc ccgcacccgc ccccgaggcc 1380
caccgactct agaggatcat aatcagccat accacatttg tagaggtttt acttgcttta 1440
aaaaacctcc cacacctccc cctgaacctg aaacataaaa tgaatgcaat tgttgttgtt 1500
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 1560
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 1620
tatcatgtct ggatccagct gttgaaagct attaaaccac aaaaaggatt actccggccc 1680
ttatcacggt tacgacggat ttggatccat aacttcgtat agcatacatt atacgaagtt 1740
ataccgggcc accatggtcg cgagtagctt ggcactggcc gtcgttttac aacgtcgtga 1800
ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc ctttcgccag 1860
ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa 1920
tggcgaatgg cgctttgcct ggtttccggc accagaagcg gtgccggaaa gctggctgga 1980
gtgcgatctt cctgaggccg atactgtcgt cgtcccctca aactggcaga tgcacggtta 2040
cgatgcgccc atctacacca acgtaaccta tcccattacg gtcaatccgc cgtttgttcc 2100
cacggagaat ccgacgggtt gttactcgct cacatttaat gttgatgaaa gctggctaca 2160
ggaaggccag acgcgaatta tttttgatgg cgttaactcg gcgtttcatc tgtggtgcaa 2220
cgggcgctgg gtcggttacg gccaggacag tcgtttgccg tctgaatttg acctgagcgc 2280
atttttacgc gccggagaaa accgcctcgc ggtgatggtg ctgcgttgga gtgacggcag 2340
ttatctggaa gatcaggata tgtggcggat gagcggcatt ttccgtgacg tctcgttgct 2400
gcataaaccg actacacaaa tcagcgattt ccatgttgcc actcgcttta atgatgattt 2460
cagccgcgct gtactggagg ctgaagttca gatgtgcggc gagttgcgtg actacctacg 2520
ggtaacagtt tctttatggc agggtgaaac gcaggtcgcc agcggcaccg cgcctttcgg 2580
cggtgaaatt atcgatgagc gtggtggtta tgccgatcgc gtcacactac gtctgaacgt 2640
cgaaaacccg aaactgtgga gcgccgaaat cccgaatctc tatcgtgcgg tggttgaact 2700
gcacaccgcc gacggcacgc tgattgaagc agaagcctgc gatgtcggtt tccgcgaggt 2760
gcggattgaa aatggtctgc tgctgctgaa cggcaagccg ttgctgattc gaggcgttaa 2820
ccgtcacgag catcatcctc tgcatggtca ggtcatggat gagcagacga tggtgcagga 2880
tatcctgctg atgaagcaga acaactttaa cgccgtgcgc tgttcgcatt atccgaacca 2940
tccgctgtgg tacacgctgt gcgaccgcta cggcctgtat gtggtggatg aagccaatat 3000
tgaaacccac ggcatggtgc caatgaatcg tctgaccgat gatccgcgct ggctaccggc 3060
gatgagcgaa cgcgtaacgc gaatggtgca gcgcgatcgt aatcacccga gtgtgatcat 3120
ctggtcgctg gggaatgaat caggccacgg cgctaatcac gacgcgctgt atcgctggat 3180
caaatctgtc gatccttccc gcccggtgca gtatgaaggc ggcggagccg acaccacggc 3240
caccgatatt atttgcccga tgtacgcgcg cgtggatgaa gaccagccct tcccggctgt 3300
gccgaaatgg tccatcaaaa aatggctttc gctacctgga gagacgcgcc cgctgatcct 3360
ttgcgaatac gcccacgcga tgggtaacag tcttggcggt ttcgctaaat actggcaggc 3420
gtttcgtcag tatccccgtt tacagggcgg cttcgtctgg gactgggtgg atcagtcgct 3480
gattaaatat gatgaaaacg gcaacccgtg gtcggcttac ggcggtgatt ttggcgatac 3540
gccgaacgat cgccagttct gtatgaacgg tctggtcttt gccgaccgca cgccgcatcc 3600
agcgctgacg gaagcaaaac accagcagca gtttttccag ttccgtttat ccgggcaaac 3660
catcgaagtg accagcgaat acctgttccg tcatagcgat aacgagctcc tgcactggat 3720
ggtggcgctg gatggtaagc cgctggcaag cggtgaagtg cctctggatg tcgctccaca 3780
aggtaaacag ttgattgaac tgcctgaact accgcagccg gagagcgccg ggcaactctg 3840
gctcacagta cgcgtagtgc aaccgaacgc gaccgcatgg tcagaagccg ggcacatcag 3900
cgcctggcag cagtggcgtc tggcggaaaa cctcagtgtg acgctccccg ccgcgtccca 3960
cgccatcccg catctgacca ccagcgaaat ggatttttgc atcgagctgg gtaataagcg 4020
ttggcaattt aaccgccagt caggctttct ttcacagatg tggattggcg ataaaaaaca 4080
actgctgacg ccgctgcgcg atcagttcac ccgtgcaccg ctggataacg acattggcgt 4140
aagtgaagcg acccgcattg accctaacgc ctgggtcgaa cgctggaagg cggcgggcca 4200
ttaccaggcc gaagcagcgt tgttgcagtg cacggcagat acacttgctg atgcggtgct 4260
gattacgacc gctcacgcgt ggcagcatca ggggaaaacc ttatttatca gccggaaaac 4320
ctaccggatt gatggtagtg gtcaaatggc gattaccgtt gatgttgaag tggcgagcga 4380
tacaccgcat ccggcgcgga ttggcctgaa ctgccagctg gcgcaggtag cagagcgggt 4440
aaactggctc ggattagggc cgcaagaaaa ctatcccgac cgccttactg ccgcctgttt 4500
tgaccgctgg gatctgccat tgtcagacat gtataccccg tacgtcttcc cgagcgaaaa 4560
cggtctgcgc tgcgggacgc gcgaattgaa ttatggccca caccagtggc gcggcgactt 4620
ccagttcaac atcagccgct acagtcaaca gcaactgatg gaaaccagcc atcgccatct 4680
gctgcacgcg gaagaaggca catggctgaa tatcgacggt ttccatatgg ggattggtgg 4740
cgacgactcc tggagcccgt cagtatcggc ggaattccag ctgagcgccg gtcgctacca 4800
ttaccagttg gtctggtgtc aaaaataata ataaccgggc aggggggatc tttgtgaagg 4860
aaccttactt ctgtggtgtg acataattgg acaaactacc tacagagatt taaagctcta 4920
aggtaaatat aaaattttta agtgtataat gtgttaaact actgattcta attgtttgtg 4980
tattttagat tccaacctat ggaactgatg aatgggagca gtggtggaat gccagatcca 5040
gacatgataa gatacattga tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa 5100
tgctttattt gtgaaatttg tgatgctatt gctttatttg taaccattat aagctgcaat 5160
aaacaagtta acaacaacaa ttgcattcat tttatgtttc aggttcaggg ggaggtgtgg 5220
gaggtttttt aaagcaagta aaacctctac aaatgtggta tggctgatta tgatctgcgg 5280
ccgcagggcc tcgtgatacg cctattttta taggttaatg tcatgataat aatggtttct 5340
tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg tttatttttc 5400

taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat gcttcaataa 5460

tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat tccctttttt 5520 

gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct 5580 

gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag cggtaagatc 5640

cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa agttctgcta 5700 

tgtggcgcgg tattatcccg tattgacgcc gggcaagagc aactcggtcg ccgcatacac 5760

tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct tacggatggc 5820 

atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac tgcggccaac 5880 

ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca caacatgggg 5940

gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat accaaacgac 6000 

gagcgtgaca ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact attaactggc 6060 

gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc ggataaagtt 6120 

gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga taaatctgga 6180 

gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg taagccctcc 6240

cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg aaatagacag 6300 

atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca agtttactca 6360 

tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc 6420 

ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca 6480

gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc 6540

tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta 6600 

ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt 6660 

ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc 6720 

gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg 6780 

ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg 6840

tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag 6900 

ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc 6960 

agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat 7020 

agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg 7080 

gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc 7140

tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt 7200 

accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca 7260 

gtgagcgagg aagcggaaga gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg 7320 

attcattaat gcagctggca cgacaggttt cccgactgga aagcgggcag tgagcgcaac 7380 

gcaattaatg tgagttagct cactcattag gcaccccagg ctttacactt tatgcttccg 7440

gctcgtatgt tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagctatgac 7500 

catgattacg ccaagctggc gcg 7523 



<210> 80

<211> 4754 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: vector 
pRK55pbsC31-Re 

<400> 80 
ggccgcccga tatgacacaa ggggttgtga ccggggtgga cacgtacgcg ggtgcttacg 60
accgtcagtc gcgcgagcgc gagaattcga gcgcagcaag cccagcgaca cagcgtagcg 120
ccaacgaaga caaggcggcc gaccttcagc gcgaagtcga gcgcgacggg ggccggttca 180
ggttcgtcgg gcatttcagc gaagcgccgg gcacgtcggc gttcgggacg gcggagcgcc 240
cggagttcga acgcatcctg aacgaatgcc gcgccgggcg gctcaacatg atcattgtct 300
atgacgtgtc gcgcttctcg cgcctgaagg tcatggacgc gattccgatt gtctcggaat 360
tgctcgccct gggcgtgacg attgtttcca ctcaggaagg cgtcttccgg cagggaaacg 420
tcatggacct gattcacctg attatgcggc tcgacgcgtc gcacaaagaa tcttcgctga 480
agtcggcgaa gattctcgac acgaagaacc ttcagcgcga attgggcggg tacgtcggcg 540
ggaaggcgcc ttacggcttc gagcttgttt cggagacgaa ggagatcacg cgcaacggcc 600
gaatggtcaa tgtcgtcatc aacaagcttg cgcactcgac cactcccctt accggaccct 660
tcgagttcga gcccgacgta atccggtggt ggtggcgtga gatcaagacg cacaaacacc 720
ttcccttcaa gccgggcagt caagccgcca ttcacccggg cagcatcacg gggctttgta 780
agcgcatgga cgctgacgcc gtgccgaccc ggggcgagac gattgggaag aagaccgctt 840
caagcgcctg ggacccggca accgttatgc gaatccttcg ggacccgcgt attgcgggct 900
tcgccgctga ggtgatctac aagaagaagc cggacggcac gccgaccacg aagattgagg 960
gttaccgcat tcagcgcgac ccgatcacgc tccggccggt cgagcttgat tgcggaccga 1020
tcatcgagcc cgctgagtgg tatgagcttc aggcgtggtt ggacggcagg gggcgcggca 1080
aggggctttc ccgggggcaa gccattctgt ccgccatgga caagctgtac tgcgagtgtg 1140
gcgccgtcat gacttcgaag cgcggggaag aatcgatcaa ggactcttac cgctgccgtc 1200
gccggaaggt ggtcgacccg tccgcacctg ggcagcacga aggcacgtgc aacgtcagca 1260
tggcggcact cgacaagttc gttgcggaac gcatcttcaa caagatcagg cacgccgaag 1320
gcgacgaaga gacgttggcg cttctgtggg aagccgcccg acgcttcggc aagctcactg 1380
aggcgcctga gaagagcggc gaacgggcga accttgttgc ggagcgcgcc gacgccctga 1440
acgcccttga agagctgtac gaagaccgcg cggcaggcgc gtacgacgga cccgttggca 1500
ggaagcactt ccggaagcaa caggcagcgc tgacgctccg gcagcaaggg gcggaagagc 1560
ggcttgccga acttgaagcc gccgaagccc cgaagcttcc ccttgaccaa tggttccccg 1620
aagacgccga cgctgacccg accggcccta agtcgtggtg ggggcgcgcg tcagtagacg 1680
acaagcgcgt gttcgtcggg ctcttcgtag acaagatcgt tgtcacgaag tcgactacgg 1740
gcagggggca gggaacgccc atcgagaagc gcgcttcgat cacgtgggcg aagccgccga 1800
ccgacgacga cgaagacgac gcccaggacg gcacggaaga cgtagcggcg taggcggcgc 1860
ccgggctcga gggggggccc ggtacccagc ttttgttccc tttagtgagg gttaatttcg 1920
agcttggcgt aatcatggtc atagctgttt cctgtgtgaa attgttatcc gctcacaatt 1980
ccacacaaca tacgagccgg aagcataaag tgtaaagcct ggggtgccta atgagtgagc 2040
taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc 2100
cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct 2160
tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca 2220
gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac 2280
atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt 2340
ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 2400
cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 2460
tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc 2520
gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc 2580
aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac 2640
tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt 2700
aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 2760
aactacggct acactagaag gacagtattt ggtatctgcg ctctgctgaa gccagttacc 2820
ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt 2880
ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg 2940
atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc 3000
atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa 3060
tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag 3120
gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg 3180
tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga 3240
gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag 3300
cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa 3360
gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc 3420
atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca 3480
aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg 3540
atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat 3600
aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc 3660
aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg 3720
gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg 3780
gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt 3840
gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca 3900
ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata 3960
ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac 4020
atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa 4080
gtgccaccta aattgtaagc gttaatattt tgttaaaatt cgcgttaaat ttttgttaaa 4140
tcagctcatt ttttaaccaa taggccgaaa tcggcaaaat cccttataaa tcaaaagaat 4200
agaccgagat agggttgagt gttgttccag tttggaacaa gagtccacta ttaaagaacg 4260
tggactccaa cgtcaaaggg cgaaaaaccg tctatcaggg cgatggccca ctacgtgaac 4320
catcacccta atcaagtttt ttggggtcga ggtgccgtaa agcactaaat cggaacccta 4380
aagggagccc ccgatttaga gcttgacggg gaaagccggc gaacgtggcg agaaaggaag 4440
ggaagaaagc gaaaggagcg ggcgctaggg cgctggcaag tgtagcggtc acgctgcgcg 4500
taaccaccac acccgccgcg cttaatgcgc cgctacaggg cgcgtcccat tcgccattca 4560
ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc ttcgctatta cgccagctgg 4620
cgaaaggggg atgtgctgca aggcgattaa gttgggtaac gccagggttt tcccagtcac 4680
gacgttgtaa aacgacggcc agtgaattgt aatacgactc actatagggc gaattggagc 4740
tccaccgcgg tggc 4754



<210> 81 
<211> 4773 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: vector 
pRK63pbsC31NLS-Re

<400> 81 

ggccgcacca tgcccaagaa gaagaggaag gtgacacaag gggttgtgac cggggtggac 60 
acgtacgcgg gtgcttacga ccgtcagtcg cgcgagcgcg agaattcgag cgcagcaagc 120 

ccagcgacac agcgtagcgc caacgaagac aaggcggccg accttcagcg cgaagtcgag 180
cgcgacgggg gccggttcag gttcgtcggg catttcagcg aagcgccggg cacgtcggcg 240 
ttcgggacgg cggagcgccc ggagttcgaa cgcatcctga acgaatgccg cgccgggcgg 300 
ctcaacatga tcattgtcta tgacgtgtcg cgcttctcgc gcctgaaggt catggacgcg 360 
attccgattg tctcggaatt gctcgccctg ggcgtgacga ttgtttccac tcaggaaggc 420 

gtcttccggc agggaaacgt catggacctg attcacctga ttatgcggct cgacgcgtcg 480
cacaaagaat cttcgctgaa gtcggcgaag attctcgaca cgaagaacct tcagcgcgaa 540 
ttgggcgggt acgtcggcgg gaaggcgcct tacggcttcg agcttgtttc ggagacgaag 600 
gagatcacgc gcaacggccg aatggtcaat gtcgtcatca acaagcttgc gcactcgacc 660 
actcccctta ccggaccctt cgagttcgag cccgacgtaa tccggtggtg gtggcgtgag 720 

atcaagacgc acaaacacct tcccttcaag ccgggcagtc aagccgccat tcacccgggc 780
agcatcacgg ggctttgtaa gcgcatggac gctgacgccg tgccgacccg gggcgagacg 840 
attgggaaga agaccgcttc aagcgcctgg gacccggcaa ccgttatgcg aatccttcgg 900 
gacccgcgta ttgcgggctt cgccgctgag gtgatctaca agaagaagcc ggacggcacg 960 
ccgaccacga agattgaggg ttaccgcatt cagcgcgacc cgatcacgct ccggccggtc 1020

gagcttgatt gcggaccgat catcgagccc gctgagtggt atgagcttca ggcgtggttg 1080
gacggcaggg ggcgcggcaa ggggctttcc cgggggcaag ccattctgtc cgccatggac 1140 
aagctgtact gcgagtgtgg cgccgtcatg acttcgaagc gcggggaaga atcgatcaag 1200 
gactcttacc gctgccgtcg ccggaaggtg gtcgacccgt ccgcacctgg gcagcacgaa 1260 
ggcacgtgca acgtcagcat ggcggcactc gacaagttcg ttgcggaacg catcttcaac 1320 

aagatcaggc acgccgaagg cgacgaagag acgttggcgc ttctgtggga agccgcccga 1380 
cgcttcggca agctcactga ggcgcctgag aagagcggcg aacgggcgaa ccttgttgcg 1440 
gagcgcgccg acgccctgaa cgcccttgaa gagctgtacg aagaccgcgc ggcaggcgcg 1500 
tacgacggac ccgttggcag gaagcacttc cggaagcaac aggcagcgct gacgctccgg 1560 
cagcaagggg cggaagagcg gcttgccgaa cttgaagccg ccgaagcccc gaagcttccc 1620 

cttgaccaat ggttccccga agacgccgac gctgacccga ccggccctaa gtcgtggtgg 1680 
gggcgcgcgt cagtagacga caagcgcgtg ttcgtcgggc tcttcgtaga caagatcgtt 1740 
gtcacgaagt cgactacggg cagggggcag ggaacgccca tcgagaagcg cgcttcgatc 1800 
acgtgggcga agccgccgac cgacgacgac gaagacgacg cccaggacgg cacggaagac 1860 
gtagcggcgt aggcggcgcc cgggctcgag ggggggcccg gtacccagct tttgttccct 1920 

ttagtgaggg ttaatttcga gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa 1980 
ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt gtaaagcctg 2040 
gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc ccgctttcca 2100 
gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg ggagaggcgg 2160 
tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 2220 

gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 2280 
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 2340 
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 2400 
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 2460 
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 2520 

ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 2580 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 2640 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 2700 
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 2760 
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 2820 

tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 2880 
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 2940 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 3000 
acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 3060 
ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 3120 

ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt 3180 
tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag 3240 
tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag caataaacca 3300 
gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc 3360 
tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt 3420 

tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag 3480 
ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt 3540 
tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat 3600 
ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt 3660 
gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc 3720 
ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat 3780 
cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag 3840 
ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt 3900 
ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg 3960 
gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta 4020 
ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc 4080 
gcgcacattt ccccgaaaag tgccacctaa attgtaagcg ttaatatttt gttaaaattc 4140 
gcgttaaatt tttgttaaat cagctcattt tttaaccaat aggccgaaat cggcaaaatc 4200 
ccttataaat caaaagaata gaccgagata gggttgagtg ttgttccagt ttggaacaag 4260 
agtccactat taaagaacgt ggactccaac gtcaaagggc gaaaaaccgt ctatcagggc 4320 
gatggcccac tacgtgaacc atcaccctaa tcaagttttt tggggtcgag gtgccgtaaa 4380 
gcactaaatc ggaaccctaa agggagcccc cgatttagag cttgacgggg aaagccggcg 4440 
aacgtggcga gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt 4500 
gtagcggtca cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc 4560 
gcgtcccatt cgccattcag gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct 4620 
tcgctattac gccagctggc gaaaggggga tgtgctgcaa ggcgattaag ttgggtaacg 4680 
ccagggtttt cccagtcacg acgttgtaaa acgacggcca gtgaattgta atacgactca 4740 
ctatagggcg aattggagct ccaccgcggt ggc 4773 



<210> 82 
<211> 7803 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector 
pPGKattA1 


<400> 82 

tatcatgtct ggatccgcgt taacacctaa gaaggcgaag ttttccttac accttgcaga 60 
tataaagtgt ctaacagttt aaaatatccg atgaggcata tttatgttgg acccgtagct 120 
cagccaggat agagcactgg cctccggagc cggaggtccc gggttcaaat cccggcgggt 180 
ccgtatatta ctttttgatt cagattagat ttgtaaatct ttattacaag gataatttga 240 
tcttgtatat tggtaactct ctactctata atttttatga gaaattcaca gtcgtccctt 300 
tataccataa atagctaagt ttgtcaaagt tcttattaaa ctctccatgt agagattaaa 360 
tcggatccat aacttcgtat agcatacatt atacgaagtt ataccgggcc accatggtcg 420 
cgagtagctt ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta 480 
cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg 540 
cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg cgctttgcct 600 
ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt cctgaggccg 660 
atactgtcgt cgtcccctca aactggcaga tgcacggtta cgatgcgccc atctacacca 720 
acgtaaccta tcccattacg gtcaatccgc cgtttgttcc cacggagaat ccgacgggtt 780 
gttactcgct cacatttaat gttgatgaaa gctggctaca ggaaggccag acgcgaatta 840 
tttttgatgg cgttaactcg gcgtttcatc tgtggtgcaa cgggcgctgg gtcggttacg 900 
gccaggacag tcgtttgccg tctgaatttg acctgagcgc atttttacgc gccggagaaa 960 
accgcctcgc ggtgatggtg ctgcgttgga gtgacggcag ttatctggaa gatcaggata 1020 
tgtggcggat gagcggcatt ttccgtgacg tctcgttgct gcataaaccg actacacaaa 1080 
tcagcgattt ccatgttgcc actcgcttta atgatgattt cagccgcgct gtactggagg 1140 
ctgaagttca gatgtgcggc gagttgcgtg actacctacg ggtaacagtt tctttatggc 1200 
agggtgaaac gcaggtcgcc agcggcaccg cgcctttcgg cggtgaaatt atcgatgagc 1260 
gtggtggtta tgccgatcgc gtcacactac gtctgaacgt cgaaaacccg aaactgtgga 1320 
gcgccgaaat cccgaatctc tatcgtgcgg tggttgaact gcacaccgcc gacggcacgc 1380 
tgattgaagc agaagcctgc gatgtcggtt tccgcgaggt gcggattgaa aatggtctgc 1440 
tgctgctgaa cggcaagccg ttgctgattc gaggcgttaa ccgtcacgag catcatcctc 1500 
tgcatggtca ggtcatggat gagcagacga tggtgcagga tatcctgctg atgaagcaga 1560 
acaactttaa cgccgtgcgc tgttcgcatt atccgaacca tccgctgtgg tacacgctgt 1620 
gcgaccgcta cggcctgtat gtggtggatg aagccaatat tgaaacccac ggcatggtgc 1680 
caatgaatcg tctgaccgat gatccgcgct ggctaccggc gatgagcgaa cgcgtaacgc 1740 
gaatggtgca gcgcgatcgt aatcacccga gtgtgatcat ctggtcgctg gggaatgaat 1800 
caggccacgg cgctaatcac gacgcgctgt atcgctggat caaatctgtc gatccttccc 1860 
gcccggtgca gtatgaaggc ggcggagccg acaccacggc caccgatatt atttgcccga 1920 
tgtacgcgcg cgtggatgaa gaccagccct tcccggctgt gccgaaatgg tccatcaaaa 1980 
aatggctttc gctacctgga gagacgcgcc cgctgatcct ttgcgaatac gcccacgcga 2040 
tgggtaacag tcttggcggt ttcgctaaat actggcaggc gtttcgtcag tatccccgtt 2100 
tacagggcgg cttcgtctgg gactgggtgg atcagtcgct gattaaatat gatgaaaacg 2160 
gcaacccgtg gtcggcttac ggcggtgatt ttggcgatac gccgaacgat cgccagttct 2220 
gtatgaacgg tctggtcttt gccgaccgca cgccgcatcc agcgctgacg gaagcaaaac 2280 
accagcagca gtttttccag ttccgtttat ccgggcaaac catcgaagtg accagcgaat 2340 
acctgttccg tcatagcgat aacgagctcc tgcactggat ggtggcgctg gatggtaagc 2400 
cgctggcaag cggtgaagtg cctctggatg tcgctccaca aggtaaacag ttgattgaac 2460 
tgcctgaact accgcagccg gagagcgccg ggcaactctg gctcacagta cgcgtagtgc 2520 
aaccgaacgc gaccgcatgg tcagaagccg ggcacatcag cgcctggcag cagtggcgtc 2580 
tggcggaaaa cctcagtgtg acgctccccg ccgcgtccca cgccatcccg catctgacca 2640 
ccagcgaaat ggatttttgc atcgagctgg gtaataagcg ttggcaattt aaccgccagt 2700 
caggctttct ttcacagatg tggattggcg ataaaaaaca actgctgacg ccgctgcgcg 2760 
atcagttcac ccgtgcaccg ctggataacg acattggcgt aagtgaagcg acccgcattg 2820 
accctaacgc ctgggtcgaa cgctggaagg cggcgggcca ttaccaggcc gaagcagcgt 2880 
tgttgcagtg cacggcagat acacttgctg atgcggtgct gattacgacc gctcacgcgt 2940 
ggcagcatca ggggaaaacc ttatttatca gccggaaaac ctaccggatt gatggtagtg 3000 
gtcaaatggc gattaccgtt gatgttgaag tggcgagcga tacaccgcat ccggcgcgga 3060 
ttggcctgaa ctgccagctg gcgcaggtag cagagcgggt aaactggctc ggattagggc 3120 
cgcaagaaaa ctatcccgac cgccttactg ccgcctgttt tgaccgctgg gatctgccat 3180 
tgtcagacat gtataccccg tacgtcttcc cgagcgaaaa cggtctgcgc tgcgggacgc 3240 
gcgaattgaa ttatggccca caccagtggc gcggcgactt ccagttcaac atcagccgct 3300 
acagtcaaca gcaactgatg gaaaccagcc atcgccatct gctgcacgcg gaagaaggca 3360 
catggctgaa tatcgacggt ttccatatgg ggattggtgg cgacgactcc tggagcccgt 3420 
cagtatcggc ggaattccag ctgagcgccg gtcgctacca ttaccagttg gtctggtgtc 3480 
aaaaataata ataaccgggc aggggggatc tttgtgaagg aaccttactt ctgtggtgtg 3540 
acataattgg acaaactacc tacagagatt taaagctcta aggtaaatat aaaattttta 3600 
agtgtataat gtgttaaact actgattcta attgtttgtg tattttagat tccaacatat 3660 
ggaactgatg aatgggagca gtggtggaat gccagatcca gacatgataa gatacattga 3720 
tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa tgctttattt gtgaaatttg 3780 
tgatgctatt gctttatttg taaccattat aagctgcaat aaacaagtta acaacaacaa 3840 
ttgcattcat tttatgtttc aggttcaggg ggaggtgtgg gaggtttttt aaagcaagta 3900 
aaacctctac aaatgtggta tggctgatta tgatctgcgg ccgcagggcc tcgtgatacg 3960 
cctattttta taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt 4020 
tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 4080 
tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 4140 
gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt 4200 
ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg 4260 
agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga 4320 
agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg 4380 
tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt 4440 
tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg 4500 
cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg 4560 
aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga 4620 
tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc 4680 
tgtagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc 4740 
ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc 4800 
ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg 4860 
cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac 4920 
gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc 4980 
actgattaag cattggtaac tgtcagacca agtttactca tatatacttt agattgattt 5040 
aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 5100 
caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 5160 
aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 5220 
accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 5280 
aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 5340 
ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 5400 
agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 5460 
accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 5520 
gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 5580 
tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 5640 
cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 5700 
cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 5760 
cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 5820 
ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 5880 
taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 5940 
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 6000 
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 6060 
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 6120 
tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagctggc 6180 

gcgtcgagga attctaccgg gtaggggagg cgcttttccc aaggcagtct ggagcatgcg 6240 

ctttagcagc cccgctgggc acttggcgct acacaagtgg cctctggctc gcacacattc 6300 

cacatccacc ggtaggcgcc aaccggctcc gttctttggt ggccccttcg cgccaccttc 6360 

tactcctccc ctagtcagga agttcccccc cgccccgcag ctcgcgtcgt gcaggacgtg 6420 

acaaatggaa gtagcacgtc tcactagtct cgtgcagatg gacagcaccg ctgagcaatg 6480 

gaagcgggta ggcctttggg gcagcggcca atagcagctt tgctccttcg ctttctgggc 6540 

tcagaggctg ggaaggggtg ggtccggggg cgggctcagg ggcgggctca ggggcggggc 6600 

gggcgcccga aggtcctccg gaggcccggc attctgcacg cttcaaaagc gcacgtctgc 6660 

cgcgctgttc tcctcttcct catctccggg cctttcgacc tgcagcccgg tacagttcga 6720 

ataacttcgt atagcataca ttatacgaag ttataagctt gcatgcctgc aggtcggccg 6780 

ccacgaccgg ccggccggtg ccgccaccat cccctgaccc acgcccctga cccctcacaa 6840 

ggagacgacc ttccatgacc gagtacaagc ccacggtgcg cctcgccacc cgcgacgacg 6900 

tcccccgggc cgtacgcacc ctcgccgccg cgttcgccga ctaccccgcc acgcgccaca 6960 

ccgtcgaccc ggaccgccac atcgagcggg tcaccgagct gcaagaactc ttcctcacgc 7020 

gcgtcgggct cgacatcggc aaggtgtggg tcgcggacga cggcgccgcg gtggcggtct 7080 

ggaccacgcc ggagagcgtc gaagcggggg cggtgttcgc cgagatcggc ccgcgcatgg 7140 

ccgagttgag cggttcccgg ctggccgcgc agcaacagat ggaaggcctc ctggcgccgc 7200 

accggcccaa ggagcccgcg tggttcctgg ccaccgtcgg cgtctcgccc gaccaccagg 7260 

gcaagggtct gggcagcgcc gtcgtgctcc ccggagtgga ggcggccgag cgcgccgggg 7320 

tgcccgcctt cctggagacc tccgcgcccc gcaacctccc cttctacgag cggctcggct 7380 

tcaccgtcac cgccgacgtc gagtgcccga aggaccgcgc gacctggtgc atgacccgca 7440 

agcccggtgc ctgacgcccg ccccacgacc cgcagcgccc gaccgaaagg agcgcacgac 7500 

cccatggctc cgaccgaagc cgacccgggc ggccccgccg accccgcacc cgcccccgag 7560 

gcccaccgac tctagaggat cataatcagc cataccacat ttgtagaggt tttacttgct 7620 

ttaaaaaacc tcccacacct ccccctgaac ctgaaacata aaatgaatgc aattgttgtt 7680 

gttaacttgt ttattgcagc ttataatggt tacaaataaa gcaatagcat cacaaatttc 7740 

acaaataaag catttttttc actgcattct agttgtggtt tgtccaaact catcaatgta 7800 

tct 7803 


<210> 83 
<211> 8167 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: vector 
pPGKattA2 

<400> 83 

tatcatgtct ggatccgcgt taacacctaa gaaggcgaag ttttccttac accttgcaga 60 
tataaagtgt ctaacagttt aaaatatccg atgaggcata tttatgttgg acccgtagct 120 
cagccaggat agagcactgg cctccggagc cggaggtccc gggttcaaat cccggcgggt 180 
ccgtatatta ctttttgatt cagattagat ttgtaaatct ttattacaag gataatttga 240 
tcttgtatat tggtaactct ctactctata atttttatga gaaattcaca gtcgtccctt 300 
tataccataa atagctaagt ttgtcaaagt tcttattaaa ctctccatgt agagattaaa 360 
tcggatccat aacttcgtat agcatacatt atacgaagtt ataccgggcc accatggtcg 420 
cgagtagctt ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta 480 
cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg 540 
cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg cgctttgcct 600 
ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt cctgaggccg 660 
atactgtcgt cgtcccctca aactggcaga tgcacggtta cgatgcgccc atctacacca 720 
acgtaaccta tcccattacg gtcaatccgc cgtttgttcc cacggagaat ccgacgggtt 780 
gttactcgct cacatttaat gttgatgaaa gctggctaca ggaaggccag acgcgaatta 840 
tttttgatgg cgttaactcg gcgtttcatc tgtggtgcaa cgggcgctgg gtcggttacg 900 
gccaggacag tcgtttgccg tctgaatttg acctgagcgc atttttacgc gccggagaaa 960 
accgcctcgc ggtgatggtg ctgcgttgga gtgacggcag ttatctggaa gatcaggata 1020 
tgtggcggat gagcggcatt ttccgtgacg tctcgttgct gcataaaccg actacacaaa 1080 
tcagcgattt ccatgttgcc actcgcttta atgatgattt cagccgcgct gtactggagg 1140 
ctgaagttca gatgtgcggc gagttgcgtg actacctacg ggtaacagtt tctttatggc 1200 
agggtgaaac gcaggtcgcc agcggcaccg cgcctttcgg cggtgaaatt atcgatgagc 1260 
gtggtggtta tgccgatcgc gtcacactac gtctgaacgt cgaaaacccg aaactgtgga 1320 
gcgccgaaat cccgaatctc tatcgtgcgg tggttgaact gcacaccgcc gacggcacgc 1380 
tgattgaagc agaagcctgc gatgtcggtt tccgcgaggt gcggattgaa aatggtctgc 1440 
tgctgctgaa cggcaagccg ttgctgattc gaggcgttaa ccgtcacgag catcatcctc 1500 
tgcatggtca ggtcatggat gagcagacga tggtgcagga tatcctgctg atgaagcaga 1560 
acaactttaa cgccgtgcgc tgttcgcatt atccgaacca tccgctgtgg tacacgctgt 1620 
gcgaccgcta cggcctgtat gtggtggatg aagccaatat tgaaacccac ggcatggtgc 1680 
caatgaatcg tctgaccgat gatccgcgct ggctaccggc gatgagcgaa cgcgtaacgc 1740 
gaatggtgca gcgcgatcgt aatcacccga gtgtgatcat ctggtcgctg gggaatgaat 1800 
caggccacgg cgctaatcac gacgcgctgt atcgctggat caaatctgtc gatccttccc 1860 
gcccggtgca gtatgaaggc ggcggagccg acaccacggc caccgatatt atttgcccga 1920 
tgtacgcgcg cgtggatgaa gaccagccct tcccggctgt gccgaaatgg tccatcaaaa 1980 
aatggctttc gctacctgga gagacgcgcc cgctgatcct ttgcgaatac gcccacgcga 2040 
tgggtaacag tcttggcggt ttcgctaaat actggcaggc gtttcgtcag tatccccgtt 2100 
tacagggcgg cttcgtctgg gactgggtgg atcagtcgct gattaaatat gatgaaaacg 2160 
gcaacccgtg gtcggcttac ggcggtgatt ttggcgatac gccgaacgat cgccagttct 2220 
gtatgaacgg tctggtcttt gccgaccgca cgccgcatcc agcgctgacg gaagcaaaac 2280 
accagcagca gtttttccag ttccgtttat ccgggcaaac catcgaagtg accagcgaat 2340 
acctgttccg tcatagcgat aacgagctcc tgcactggat ggtggcgctg gatggtaagc 2400 
cgctggcaag cggtgaagtg cctctggatg tcgctccaca aggtaaacag ttgattgaac 2460 
tgcctgaact accgcagccg gagagcgccg ggcaactctg gctcacagta cgcgtagtgc 2520 
aaccgaacgc gaccgcatgg tcagaagccg ggcacatcag cgcctggcag cagtggcgtc 2580 
tggcggaaaa cctcagtgtg acgctccccg ccgcgtccca cgccatcccg catctgacca 2640 
ccagcgaaat ggatttttgc atcgagctgg gtaataagcg ttggcaattt aaccgccagt 2700 
caggctttct ttcacagatg tggattggcg ataaaaaaca actgctgacg ccgctgcgcg 2760 
atcagttcac ccgtgcaccg ctggataacg acattggcgt aagtgaagcg acccgcattg 2820 
accctaacgc ctgggtcgaa cgctggaagg cggcgggcca ttaccaggcc gaagcagcgt 2880 
tgttgcagtg cacggcagat acacttgctg atgcggtgct gattacgacc gctcacgcgt 2940 
ggcagcatca ggggaaaacc ttatttatca gccggaaaac ctaccggatt gatggtagtg 3000 
gtcaaatggc gattaccgtt gatgttgaag tggcgagcga tacaccgcat ccggcgcgga 3060 
ttggcctgaa ctgccagctg gcgcaggtag cagagcgggt aaactggctc ggattagggc 3120 
cgcaagaaaa ctatcccgac cgccttactg ccgcctgttt tgaccgctgg gatctgccat 3180 
tgtcagacat gtataccccg tacgtcttcc cgagcgaaaa cggtctgcgc tgcgggacgc 3240 
gcgaattgaa ttatggccca caccagtggc gcggcgactt ccagttcaac atcagccgct 3300 
acagtcaaca gcaactgatg gaaaccagcc atcgccatct gctgcacgcg gaagaaggca 3360 
catggctgaa tatcgacggt ttccatatgg ggattggtgg cgacgactcc tggagcccgt 3420 
cagtatcggc ggaattccag ctgagcgccg gtcgctacca ttaccagttg gtctggtgtc 3480 
aaaaataata ataaccgggc aggggggatc tttgtgaagg aaccttactt ctgtggtgtg 3540 
acataattgg acaaactacc tacagagatt taaagctcta aggtaaatat aaaattttta 3600 
agtgtataat gtgttaaact actgattcta attgtttgtg tattttagat tccaacctat 3660 
ggaactgatg aatgggagca gtggtggaat gccagatcca gacatgataa gatacattga 3720 
tgagtttgga caaaccacaa ctagaatgca gtgaaaaaaa tgctttattt gtgaaatttg 3780 
tgatgctatt gctttatttg taaccattat aagctgcaat aaacaagtta acaacaacaa 3840 
ttgcattcat tttatgtttc aggttcaggg ggaggtgtgg gaggtttttt aaagcaagta 3900 
aaacctctac aaatgtggta tggctgatta tgatctgcgg ccgcagggcc tcgtgatacg 3960 
cctattttta taggttaatg tcatgataat aatggtttct tagacgtcag gtggcacttt 4020 
tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 4080 
tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 4140 
gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt 4200 
ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg 4260 
agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga 4320 
agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg 4380 
tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt 4440 
tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg 4500 
cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg 4560 
aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga 4620 
tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc 4680 
tgtagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc 4740 
ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc 4800 
ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg 4860 
cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac 4920 
gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc 4980 
actgattaag cattggtaac tgtcagacca agtttactca tatatacttt agattgattt 5040 
aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 5100 
caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 5160 
aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 5220 
accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 5280 
aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 5340 
ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 5400 
agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 5460 
accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 5520 
gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 5580 
tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 5640 
cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 5700 
cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 5760 
cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 5820 
ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 5880 
taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 5940 
gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg attcattaat gcagctggca 6000 
cgacaggttt cccgactgga aagcgggcag tgagcgcaac gcaattaatg tgagttagct 6060 
cactcattag gcaccccagg ctttacactt tatgcttccg gctcgtatgt tgtgtggaat 6120 

tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg ccaagctggc 6180 
gcgtcgagga attctaccgg gtaggggagg cgcttttccc aaggcagtct ggagcatgcg 6240 
ctttagcagc cccgctgggc acttggcgct acacaagtgg cctctggctc gcacacattc 6300 
cacatccacc ggtaggcgcc aaccggctcc gttctttggt ggccccttcg cgccaccttc 6360 
tactcctccc ctagtcagga agttcccccc cgccccgcag ctcgcgtcgt gcaggacgtg 6420 

acaaatggaa gtagcacgtc tcactagtct cgtgcagatg gacagcaccg ctgagcaatg 6480 
gaagcgggta ggcctttggg gcagcggcca atagcagctt tgctccttcg ctttctgggc 6540 
tcagaggctg ggaaggggtg ggtccggggg cgggctcagg ggcgggctca ggggcggggc 6600 
gggcgcccga aggtcctccg gaggcccggc attctgcacg cttcaaaagc gcacgtctgc 6660 
cgcgctgttc tcctcttcct catctccggg cctttcgacc tgcagcccgg tacagttcga 6720 

aggatccgcg ttaacaccta agaaggcgaa gttttcctta caccttgcag atataaagtg 6780 
tctaacagtt taaaatatcc gatgaggcat atttatgttg gacccgtagc tcagccagga 6840 
tagagcactg gcctccggag ccggaggtcc cgggttcaaa tcccggcggg tccgtatatt 6900 
actttttgat tcagattaga tttgtaaatc tttattacaa ggataatttg atcttgtata 6960 
ttggtaactc tctactctat aatttttatg agaaattcac agtcgtccct ttataccata 7020 

aatagctaag tttgtcaaag ttcttattaa actctccatg tagagattaa atcggatcct 7080 
tcgaataact tcgtatagca tacattatac gaagttataa gcttgcatgc ctgcaggtcg 7140 
gccgccacga ccggccggcc ggtgccgcca ccatcccctg acccacgccc ctgacccctc 7200 
acaaggagac gaccttccat gaccgagtac aagcccacgg tgcgcctcgc cacccgcgac 7260 
gacgtccccc gggccgtacg caccctcgcc gccgcgttcg ccgactaccc cgccacgcgc 7320 

cacaccgtcg acccggaccg ccacatcgag cgggtcaccg agctgcaaga actcttcctc 7380 
acgcgcgtcg ggctcgacat cggcaaggtg tgggtcgcgg acgacggcgc cgcggtggcg 7440 
gtctggacca cgccggagag cgtcgaagcg ggggcggtgt tcgccgagat cggcccgcgc 7500 
atggccgagt tgagcggttc ccggctggcc gcgcagcaac agatggaagg cctcctggcg 7560 
ccgcaccggc ccaaggagcc cgcgtggttc ctggccaccg tcggcgtctc gcccgaccac 7620 

cagggcaagg gtctgggcag cgccgtcgtg ctccccggag tggaggcggc cgagcgcgcc 7680 
ggggtgcccg ccttcctgga gacctccgcg ccccgcaacc tccccttcta cgagcggctc 7740 
ggcttcaccg tcaccgccga cgtcgagtgc ccgaaggacc gcgcgacctg gtgcatgacc 7800 
cgcaagcccg gtgcctgacg cccgccccac gacccgcagc gcccgaccga aaggagcgca 7860 
cgaccccatg gctccgaccg aagccgaccc gggcggcccc gccgaccccg cacccgcccc 7920 

cgaggcccac cgactctaga ggatcataat cagccatacc acatttgtag aggttttact 7980 
tgctttaaaa aacctcccac acctccccct gaacctgaaa cataaaatga atgcaattgt 8040 
tgttgttaac ttgtttattg cagcttataa tggttacaaa taaagcaata gcatcacaaa 8100 
tttcacaaat aaagcatttt tttcactgca ttctagttgt ggtttgtcca aactcatcaa 8160 
tgtatct 8167 


<210> 84 
<211> 51 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: primer XisA1 
<400> 84 

ataagaatgc ggccgcccga tatgcaaaat cagggtcaag acaaatatca a 51 



<210> 85 
<211> 76 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer XisA2 
<400> 85 
ataagaatgc ggccgcacca tgcccaagaa gaagaggaag gtgcaaaatc agggtcaaga 60 
caaatatcaa caagcc 76 



<210> 86 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer XisA3 
<400> 86 

ataagaatgc ggccgctcaa ctattcttat aagctatttc catc 44 



<210> 87 
<211> 82 
<212> DNA 
<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: primer nifD1 
<400> 87 

cgatggctct tcccttccgt caaatgcact cttgggatta ctccgaacct agcgatgggg 60 
tgcaaatgtc agatcagata ag 82 



<210> 88 
<211> 82 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer nifD2 
<400> 88 

cgcttatctg atctgacatt tgcaccccat cgctaggttc ggagtaatcc caagagtgca 60 
tttgacggaa gggaagagcc at 82 



<210> 89 
<211> 74 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: primer nifD3 
<400> 89 

gatcagctgt tgaaagctat taaaccacaa aaaggattac tccggccctt atcacggtta 60 
cgacggattt gcta 74 


<210> 90 
<211> 74 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> Description of Artificial Sequence: primer nifD4 
<400> 90 

gatctagcaa atccgtcgta accgtgataa gggccggagt aatccttttt gtggtttaat 60 
agctttcaac agct 74 
<210> 91 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV1-1 
<400> 91 

ataagaatgc ggccgcccga tatgacgaaa gataagacgc g 41 



<210> 92 
<211> 70 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV1-2 
<400> 92 

ataagaatgc ggccgcacca tgcccaagaa gaagaggaag gtgacgaaag ataagacgcg 60 
ttataaatac 70 



<210> 93 
<211> 47 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV2 
<400> 93 

tgtcccgggc tcgaaaccgg ggggatccgc ttgtagggga gtatccc 47 

<210> 94 
<211> 47 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV3 
<400> 94 

gagcccggga caagcggaag cggtggtgga aaagagggaa ctgaacg 47 

<210> 95 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV4 
<400> 95 

atcgctcgag tcagacccct tttagccatt ccg 33 



<210> 96 
<211> 40 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer SSV5 






<400> 96 

atcgttcgaa ggatccgcgt taacacctaa gaaggcgaag 40 



<210> 97 
<211> 38 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: primer SSV6 


<400> 97 

atcgttcgaa ggatccgatt taatctctac atggagag 38 



<210> 98 
<211> 64 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer C31-2 
<400> 98 

ataagaatgc ggccgcacca tgcccaagaa gaagaggaag gtgacacaag gggttgtgac 60 
cggg 64 



<210> 99 
<211> 4831 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: vector pRK41 


<400> 99 

ggccgcacca tgcccaagaa gaagaggaag gtgacacaag gggttgtgac cggggtggac 60 
acgtacgcgg gtgcttacga ccgtcagtcg cgcgagcgcg agaattcgag cgcagcaagc 120 
ccagcgacac agcgtagcgc caacgaagac aaggcggccg accttcagcg cgaagtcgag 180 

cgcgacgggg gccggttcag gttcgtcggg catttcagcg aagcgccggg cacgtcggcg 240 
ttcgggacgg cggagcgccc ggagttcgaa cgcatcctga acgaatgccg cgccgggcgg 300 
ctcaacatga tcattgtcta tgacgtgtcg cgcttctcgc gcctgaaggt catggacgcg 360 
attccgattg tctcggaatt gctcgccctg ggcgtgacga ttgtttccac tcaggaaggc 420 
gtcttccggc agggaaacgt catggacctg attcacctga ttatgcggct cgacgcgtcg 480 

cacaaagaat cttcgctgaa gtcggcgaag attctcgaca cgaagaacct tcagcgcgaa 540 
ttgggcgggt acgtcggcgg gaaggcgcct tacggcttcg agcttgtttc ggagacgaag 600 
gagatcacgc gcaacggccg aatggtcaat gtcgtcatca acaagcttgc gcactcgacc 660 
actcccctta ccggaccctt cgagttcgag cccgacgtaa tccggtggtg gtggcgtgag 720 
atcaagacgc acaaacacct tcccttcaag ccgggcagtc aagccgccat tcacccgggc 780 

agcatcacgg ggctttgtaa gcgcatggac gctgacgccg tgccgacccg gggcgagacg 840 
attgggaaga agaccgcttc aagcgcctgg gacccggcaa ccgttatgcg aatccttcgg 900 
gacccgcgta ttgcgggctt cgccgctgag gtgatctaca agaagaagcc ggacggcacg 960 
ccgaccacga agattgaggg ttaccgcatt cagcgcgacc cgatcacgct ccggccggtc 1020 
gagcttgatt gcggaccgat catcgagccc gctgagtggt atgagcttca ggcgtggttg 1080 

gacggcaggg ggcgcggcaa ggggctttcc cgggggcaag ccattctgtc cgccatggac 1140 
aagctgtact gcgagtgtgg cgccgtcatg acttcgaagc gcggggaaga atcgatcaag 1200 
gactcttacc gctgccgtcg ccggaaggtg gtcgacccgt ccgcacctgg gcagcacgaa 1260 
ggcacgtgca acgtcagcat ggcggcactc gacaagttcg ttgcggaacg catcttcaac 1320 
aagatcaggc acgccgaagg cgacgaagag acgttggcgc ttctgtggga agccgcccga 1380 

cgcttcggca agctcactga ggcgcctgag aagagcggcg aacgggcgaa ccttgttgcg 1440 
gagcgcgccg acgccctgaa cgcccttgaa gagctgtacg aagaccgcgc ggcaggcgcg 1500 
tacgacggac ccgttggcag gaagcacttc cggaagcaac aggcagcgct gacgctccgg 1560 
cagcaagggg 
cttgaccaat 
gggcgcgcgt 
gtcacgaagt 
acgtgggcga 
gtagcggcgt 
caagcttatc 
agtgagggtt 
gttatccgct 
gtgcctaatg 
cgggaaacct 
tgcgtattgg 
tgcggcgagc 
ataacgcagg 
ccgcgttgct 
gctcaagtca 
gaagctccct 
ttctcccttc 
tgtaggtcgt 
gcgccttatc 
tggcagcagc 
tcttgaagtg 
tgctgaagcc 
ccgctggtag 
ctcaagaaga 
gttaagggat 
aaaaatgaag 
aatgcttaat 
cctgactccc 
ctgcaatgat 
cagccggaag 
ttaattgttg 
ttgccattgc 
ccggttccca 
gctccttcgg 
ttatggcagc 
ctggtgagta 
gcccggcgtc 
ttggaaaacg 
cgatgtaacc 
ctgggtgagc 
aatgttgaat 
gtctcatgag 
gcacatttcc 
gttaaatttt 
ttataaatca 
tccactatta 
tggcccacta 
actaaatcgg 
cgtggcgaga 
agcggtcacg 
gtcccattcg 
gctattacgc 
agggttttcc 
atagggcgaa 



cggaagagcg 
ggttccccga 
cagtagacga 
cgactacggg 
agccgccgac 
agcggccgct 
gataccgtcg 
aatttcgagc 
cacaattcca 
agtgagctaa 
gtcgtgccag 
gcgctcttcc 
ggtatcagct 
aaagaacatg 
ggcgtttttc 
gaggtggcga 
cgtgcgctct 
gggaagcgtg 
tcgctccaag 
cggtaactat 
cactggtaac 
gtggcctaac 
agttaccttc 
cggtggtttt 
tcctttgatc 
tttggtcatg 
ttttaaatca 
cagtgaggca 
cgtcgtgtag 
accgcgagac 
ggccgagcgc 
ccgggaagct 
tacaggcatc 
acgatcaagg 
tcctccgatc 
actgcataat 
ctcaaccaag 
aatacgggat 
ttcttcgggg 
cactcgtgca 
aaaaacagga 
actcatactc 
cggatacata 
ccgaaaagtg 
tgttaaatca 
aaagaataga 
aagaacgtgg 
cgtgaaccat 
aaccctaaag 
aaggaaggga 
ctgcgcgtaa 
ccattcaggc 
cagctggcga 
cagtcacgac 
ttggagctcc 



gcttgccgaa 
agacgccgac 
caagcgcgtg 
cagggggcag 
cgacgacgac 
ctagaactag 
acctcgaggg 
ttggcgtaat 
cacaacatac 
ctcacattaa 
ctgcattaat 
gcttcctcgc 
cactcaaagg 
tgagcaaaag 
cataggctcc 
aacccgacag 
cctgttccga 
gcgctttctc 
ctgggctgtg 
cgtcttgagt 
aggattagca 
tacggctaca 
ggaaaaagag 
tttgtttgca 
ttttctacgg 
agattatcaa 
atctaaagta 
cctatctcag 
ataactacga 
ccacgctcac 
agaagtggtc 
agagtaagta 
gtggtgtcac 
cgagttacat 
gttgtcagaa 
tctcttactg 
tcattctgag 
aataccgcgc 
cgaaaactct 
cccaactgat 
aggcaaaatg 
ttcctttttc 
tttgaatgta 
ccacctaaat 
gctcattttt 
ccgagatagg 
actccaacgt 
caccctaatc 
ggagcccccg 
agaaagcgaa 
ccaccacacc 
tgcgcaactg 
aagggggatg 
gttgtaaaac 
accgcggtgg 
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cttgaagccg 
gctgacccga 
ttcgtcgggc 
ggaacgccca 
gaagacgacg 
tggatccccc 
ggggcccggt 
catggtcata 
gagccggaag 
ttgcgttgcg 
gaatcggcca 
tcactgactc 
cggtaatacg 
gccagcaaaa 
gcccccctga 
gactataaag 
ccctgccgct 
atagctcacg 
tgcacgaacc 
ccaacccggt 
gagcgaggta 
ctagaaggac 
ttggtagctc 
agcagcagat 
ggtctgacgc 
aaaggatctt 
tatatgagta 
cgatctgtct 
tacgggaggg 
cggctccaga 
ctgcaacttt 
gttcgccagt 
gctcgtcgtt 
gatcccccat 
gtaagttggc 
tcatgccatc 
aatagtgtat 
cacatagcag 
caaggatctt 
cttcagcatc 
ccgcaaaaaa 
aatattattg 
tttagaaaaa 
tgtaagcgtt 
taaccaatag 
gttgagtgtt 
caaagggcga 
aagttttttg 
atttagagct 
aggagcgggc 
cgccgcgctt 
ttgggaaggg 
tgctgcaagg 
gacggccagt 



ccgaagcccc 
ccggccctaa 
tcttcgtaga 
tcgagaagcg 
cccaggacgg 
gggctgcagg 
acccagcttt 
gctgtttcct 
cataaagtgt 
ctcactgccc 
acgcgcgggg 
gctgcgctcg 
gttatccaca 
ggccaggaac 
cgagcatcac 
ataccaggcg 
taccggatac 
ctgtaggtat 
ccccgttcag 
aagacacgac 
tgtaggcggt 
agtatttggt 
ttgatccggc 
tacgcgcaga 
tcagtggaac 
cacctagatc 
aacttggtct 
atttcgttca 
cttaccatct 
tttatcagca 
atccgcctcc 
taatagtttg 
tggtatggct 
gttgtgcaaa 
cgcagtgtta 
cgtaagatgc 
gcggcgaccg 
aactttaaaa 
accgctgttg 
ttttactttc 
gggaataagg 
aagcatttat 
taaacaaata 
aatattttgt 
gccgaaatcg 
gttccagttt 
aaaaccgtct 
gggtcgaggt 
tgacggggaa 
gctagggcgc 
aatgcgccgc 
cgatcggtgc 
cgattaagtt 
gaattgtaat 



gaagcttccc 
gtcgtggtgg 
caagatcgtt 
cgcttcgatc 
cacggaagac 
aattcgatat 
tgttcccttt 
gtgtgaaatt 
aaagcctggg 
gctttccagt 
agaggcggtt 
gtcgttcggc 
gaatcagggg 
cgtaaaaagg 
aaaaatcgac 
tttccccctg 
ctgtccgcct 
ctcagttcgg 
cccgaccgct 
ttatcgccac 
gctacagagt 
atctgcgctc 
aaacaaacca 
aaaaaaggat 
gaaaactcac 
cttttaaatt 
gacagttacc 
tccatagttg 
ggccccagtg 
ataaaccagc 
atccagtcta 
cgcaacgttg 
tcattcagct 
aaagcggtta 
tcactcatgg 
ttttctgtga 
agttgctctt 
gtgctcatca 
agatccagtt 
accagcgttt 
gcgacacgga 
cagggttatt 
ggggttccgc 
taaaattcgc 
gcaaaatccc 
ggaacaagag 
atcagggcga 
gccgtaaagc 
agccggcgaa 
tggcaagtgt 
tacagggcgc 
gggcctcttc 
gggtaacgcc 
acgactcact 



1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4831 



<210> 100 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
C31-screen 1 

<400> 100 
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20 



<210> 101 
<211> 21
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer
C31-screen 2 






<400> 101 

gcagcggtaa gagtccttga t 21 



<210> 102 
<211> 20 
<212> DNA 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: primer
beta-Gal 3 

<400> 102 

atcctctgca tggtcaggtc 20 



<210> 103
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: primer 
beta-Gal 4 

<400> 103
cgtggcctga ttcattcc 18



<210> 104 
<211> 5878 
<212> DNA

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: vector 
pCAGGS-Cre^pA

<400> 104 

cgccgcgtgc ggcccgcgct gcccggcggc tgtgagcgct gcgggcgcgg cgcggggctt 60 

tgtgcgctcc gcgtgtgcgc gaggggagcg cggccggggg cggtgccccg cggtgcgggg 120 

gggctgcgag gggaacaaag gctgcgtgcg gggtgtgtgc gtgggggggt gagcaggggg 180

tgtgggcgcg gcggtcgggc tgtaaccccc ccctgcaccc ccctccccga gttgctgagc 240

acggcccggc ttcgggtgcg gggctccgtg cggggcgtgg cgcggggctc gccgtgccgg 300

gcggggggtg gcggcaggtg ggggtgccgg gcggggcggg gccgcctcgg gccggggagg 360

gctcggggga ggggcgcggc ggccccggag cgccggcggc tgtcgaggcg cggcgagccg 420

cagccattgc cttttatggt aatcgtgcga gagggcgcag ggacttcctt tgtcccaaat 480

ctggcggagc cgaaatctgg gaggcgccgc cgcaccccct ctagcgggcg cgggcgaagc 540

ggtgcggcgc cggcaggaag gaaatgggcg gggagggcct tcgtgcgtcg ccgcgccgcc 600

gtccccttct ccatctccag cctcggggct gccgcagggg gacggctgcc ttcggggggg 660

acggggcagg gcggggttcg gcttctggcg tgtgaccggc ggctctagta agcgttgggg 720

tgagtactcc ctctcaaaag cgggcatgac ttctgcgcta agattgtcag tttccaaaaa 780

cgaggaggat ttgatattca cctggcccgc ggtgatgcct ttgagggtgg ccgcgtccat 840 

ctggtcagaa aagacaatct ttttgttgtc aagcttgagg tgtggcaggc ttgagatctg 900
gccatacact tgagtgacat tgacatccac tttgcctttc tctccacagg tgtccactcc 960
cagggcggcc tcgaccatgc ccaagaagaa gaggaaggtg tccaatttac tgaccgtaca 1020
ccaaaatttg cctgcattac cggtcgatgc aacgagtgat gaggttcgca agaacctgat 1080
ggacatgttc agggatcgcc aggcgttttc tgagcatacc tggaaaatgc ttctgtccgt 1140
ttgccggtcg tgggcggcat ggtgcaagtt gaataaccgg aaatggtttc ccgcagaacc 1200
tgaagatgtt cgcgattatc ttctatatct tcaggcgcgc ggtctggcag taaaaactat 1260
ccagcaacat ttgggccagc taaacatgct tcatcgtcgg tccgggctgc cacgaccaag 1320
tgacagcaat gctgtttcac tggttatgcg gcggatccga aaagaaaacg ttgatgccgg 1380
tgaacgtgca aaacaggctc tagcgttcga acgcactgat ttcgaccagg ttcgttcact 1440
catggaaaat agcgatcgct gccaggatat acgtaatctg gcatttctgg ggattgctta 1500
taacaccctg ttacgtatag ccgaaattgc caggatcagg gttaaagata tctcacgtac 1560
tgacggtggg agaatgttaa tccatattgg cagaacgaaa acgctggtta gcaccgcagg 1620
tgtagagaag gcacttagcc tgggggtaac taaactggtc gagcgatgga tttccgtctc 1680
tggtgtagct gatgatccga ataactacct gttttgccgg gtcagaaaaa atggtgttgc 1740
cgcgccatct gccaccagcc agctatcaac tcgcgccctg gaagggattt ttgaagcaac 1800
tcatcgattg atttacggcg ctaaggatga ctctggtcag agatacctgg cctggtctgg 1860
acacagtgcc cgtgtcggag ccgcgcgaga tatggcccgc gctggagttt caataccgga 1920
gatcatgcaa gctggtggct ggaccaatgt aaatattgtc atgaactata tccgtaacct 1980
ggatagtgaa acaggggcaa tggtgcgcct gctggaagat ggcgattagc cattaacgcg 2040
taaatgattg cagatccact agttctaggg ccgcgtcgac ctcgagatcc aggcgcggat 2100
caataaaaga tcattatttt caatagatct gtgtgttggt tttttgtgtg ccttggggga 2160
gggggaggcc agaatgaggc gcggccaagg gggaggggga ggccagaatg accttggggg 2220
agggggaggc cagaatgacc ttgggggagg gggaggccag aatgaggcgc gcccccgggt 2280
accgagctcg aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 2340
tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 2400
ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat ggcgcctgat 2460
gcggtatttt ctccttacgc atctgtgcgg tatttcacac cgcatatggt gcactctcag 2520
tacaatctgc tctgatgccg catagttaag ccagccccga cacccgccaa cacccgctga 2580
cgcgccctga cgggcttgtc tgctcccggc atccgcttac agacaagctg tgaccgtctc 2640
cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga gacgaaaggg 2700
cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt cttagacgtc 2760
aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca 2820
ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa 2880
aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 2940
ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca 3000
gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag 3060
ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc 3120
ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca 3180
gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 3240
aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct 3300
gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt 3360
aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga 3420
caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact 3480
tactctagct tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 3540
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga 3600
gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt 3660
agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga 3720
gataggtgcc tcactgatta agcattggta actgtcagac caagtttact catatatact 3780
ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 3840
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt 3900
agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca 3960
aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct 4020
ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgtcc ttctagtgta 4080
gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 4140
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc 4200
aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca 4260
gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga 4320
aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg 4380
aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 4440
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag 4500
cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt 4560
tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt 4620
tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga 4680
ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 4740
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa 4800
tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat 4860
gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta 4920
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cgccaagcta gcccgggcta gcttgcatgc ctgcaggttt tcgacattga ttattgacta 4980 

gttattaata gtaatcaatt acggggtcat tagttcatag cccatatatg gagttccgcg 5040 

ttacataact tacggtaaat ggcccgcctg gctgaccgcc caacgacccc cgcccattga 5100 

cgtcaataat gacgtatgtt cccatagtaa cgccaatagg gactttccat tgacgtcaat 5160 

gggtggacta tttacggtaa actgcccact tggcagtaca tcaagtgtat catatgccaa 5220

gtacgccccc tattgacgtc aatgacggta aatggcccgc ctggcattat gcccagtaca 5280

tgaccttatg ggactttcct acttggcagt acatctacgt attagtcatc gctattacca 5340

tgggtcgagg tgagccccac gttctgcttc actctcccca tctccccccc ctccccaccc 5400

ccaattttgt atttatttat tttttaatta ttttgtgcag cgatgggggc gggggggggg 5460

ggggcgcgcg ccaggcgggg cggggcgggg cgaggggcgg ggcggggcga ggcggagagg 5520

tgcggcggca gccaatcaga gcggcgcgct ccgaaagttt ccttttatgg cgaggcggcg 5580

gcggcggcgg ccctataaaa agcgaagcgc gcggcgggcg ggagtcgctg cgttgccttc 5640

gccccgtgcc ccgctccgcg ccgcctcgcg ccgcccgccc cggctctgac tgaccgcgtt 5700

actcccacag gtgagcgggc gggacggccc ttctcctccg ggctgtaatt agcgcttggt 5760

ttaatgacgg ctcgtttctt ttctgtggct gcgtgaaagc cttaaagggc tccgggaggg 5820

ccctttgtgc gggggggagc ggctcggggg gtgcgtgcgt gtgtgtgtgc gtggggag 5878 



<210> 105 
<211> 6641
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: vector
pCAGGSC31CNLS^pA 

<400> 105 

attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 60 

atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120 

acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180 

tccattgacg tcaatgggtg gactatttac ggtaaactgc ccacttggca gtacatcaag 240

tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300 

attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 360 

tcatcgctat taccatgggt cgaggtgagc cccacgttct gcttcactct ccccatctcc 420 

cccccctccc cacccccaat tttgtattta tttatttttt aattattttg tgcagcgatg 480 

ggggcggggg gggggggggc gcgcgccagg cggggcgggg cggggcgagg ggcggggcgg 540 

ggcgaggcgg agaggtgcgg cggcagccaa tcagagcggc gcgctccgaa agtttccttt 600 

tatggcgagg cggcggcggc ggcggcccta taaaaagcga agcgcgcggc gggcgggagt 660 

cgctgcgttg ccttcgcccc gtgccccgct ccgcgccgcc tcgcgccgcc cgccccggct 720 

ctgactgacc gcgttactcc cacaggtgag cgggcgggac ggcccttctc ctccgggctg 780 

taattagcgc ttggtttaat gacggctcgt ttcttttctg tggctgcgtg aaagccttaa 840 

agggctccgg gagggccctt tgtgcggggg ggagcggctc ggggggtgcg tgcgtgtgtg 900 

tgtgcgtggg gagcgccgcg tgcggcccgc gctgcccggc ggctgtgagc gctgcgggcg 960 

cggcgcgggg ctttgtgcgc tccgcgtgtg cgcgagggga gcgcggccgg gggcggtgcc 1020 

ccgcggtgcg ggggggctgc gaggggaaca aaggctgcgt gcggggtgtg tgcgtggggg 1080 

ggtgagcagg gggtgtgggc gcggcggtcg ggctgtaacc cccccctgca cccccctccc 1140 

cgagttgctg agcacggccc ggcttcgggt gcggggctcc gtgcggggcg tggcgcgggg 1200 

ctcgccgtgc cgggcggggg gtggcggcag gtgggggtgc cgggcggggc ggggccgcct 1260 

cgggccgggg agggctcggg ggaggggcgc ggcggccccg gagcgccggc ggctgtcgag 1320 

gcgcggcgag ccgcagccat tgccttttat ggtaatcgtg cgagagggcg cagggacttc 1380 

ctttgtccca aatctggcgg agccgaaatc tgggaggcgc cgccgcaccc cctctagcgg 1440

gcgcgggcga agcggtgcgg cgccggcagg aaggaaatgg gcggggaggg ccttcgtgcg 1500 

tcgccgcgcc gccgtcccct tctccatctc cagcctcggg gctgccgcag ggggacggct 1560 

gccttcgggg gggacggggc agggcggggt tcggcttctg gcgtgtgacc ggcggctcta 1620 

gtaagcgttg gggtgagtac tccctctcaa aagcgggcat gacttctgcg ctaagattgt 1680 

cagtttccaa aaacgaggag gatttgatat tcacctggcc cgcggtgatg cctttgaggg 1740

tggccgcgtc catctggtca gaaaagacaa tctttttgtt gtcaagcttg aggtgtggca 1800 

ggcttgagat ctggccatac acttgagtga cattgacatc cactttgcct ttctctccac 1860 

aggtgtccac tcccagggcg gccgcccgat atgacacaag gggttgtgac cggggtggac 1920 

acgtacgcgg gtgcttacga ccgtcagtcg cgcgagcgcg agaattcgag cgcagcaagc 1980 

ccagcgacac agcgtagcgc caacgaagac aaggcggccg accttcagcg cgaagtcgag 2040 

cgcgacgggg gccggttcag gttcgtcggg catttcagcg aagcgccggg cacgtcggcg 2100 

ttcgggacgg cggagcgccc ggagttcgaa cgcatcctga acgaatgccg cgccgggcgg 2160 

ctcaacatga tcattgtcta tgacgtgtcg cgcttctcgc gcctgaaggt catggacgcg 2220 

attccgattg tctcggaatt gctcgccctg ggcgtgacga ttgtttccac tcaggaaggc 2280 

gtcttccggc agggaaacgt catggacctg attcacctga ttatgcggct cgacgcgtcg 2340
cacaaagaat cttcgctgaa gtcggcgaag attctcgaca cgaagaacct tcagcgcgaa 2400
ttgggcgggt acgtcggcgg gaaggcgcct tacggcttcg agcttgtttc ggagacgaag 2460
gagatcacgc gcaacggccg aatggtcaat gtcgtcatca acaagcttgc gcactcgacc 2520
actcccctta ccggaccctt cgagttcgag cccgacgtaa tccggtggtg gtggcgtgag 2580
atcaagacgc acaaacacct tcccttcaag ccgggcagtc aagccgccat tcacccgggc 2640
agcatcacgg ggctttgtaa gcgcatggac gctgacgccg tgccgacccg gggcgagacg 2700
attgggaaga agaccgcttc aagcgcctgg gacccggcaa ccgttatgcg aatccttcgg 2760
gacccgcgta ttgcgggctt cgccgctgag gtgatctaca agaagaagcc ggacggcacg 2820
ccgaccacga agattgaggg ttaccgcatt cagcgcgacc cgatcacgct ccggccggtc 2880
gagcttgatt gcggaccgat catcgagccc gctgagtggt atgagcttca ggcgtggttg 2940
gacggcaggg ggcgcggcaa ggggctttcc cgggggcaag ccattctgtc cgccatggac 3000
aagctgtact gcgagtgtgg cgccgtcatg acttcgaagc gcggggaaga atcgatcaag 3060
gactcttacc gctgccgtcg ccggaaggtg gtcgacccgt ccgcacctgg gcagcacgaa 3120
ggcacgtgca acgtcagcat ggcggcactc gacaagttcg ttgcggaacg catcttcaac 3180
aagatcaggc acgccgaagg cgacgaagag acgttggcgc ttctgtggga agccgcccga 3240
cgcttcggca agctcactga ggcgcctgag aagagcggcg aacgggcgaa ccttgttgcg 3300
gagcgcgccg acgccctgaa cgcccttgaa gagctgtacg aagaccgcgc ggcaggcgcg 3360
tacgacggac ccgttggcag gaagcacttc cggaagcaac aggcagcgct gacgctccgg 3420
cagcaagggg cggaagagcg gcttgccgaa cttgaagccg ccgaagcccc gaagcttccc 3480
cttgaccaat ggttccccga agacgccgac gctgacccga ccggccctaa gtcgtggtgg 3540
gggcgcgcgt cagtagacga caagcgcgtg ttcgtcgggc tcttcgtaga caagatcgtt 3600
gtcacgaagt cgactacggg cagggggcag ggaacgccca tcgagaagcg cgcttcgatc 3660
acgtgggcga agccgccgac cgacgacgac gaagacgacg cccaggacgg cacggaagac 3720
gtagcggcgc ctaagaagaa gaggaaggtt tagactctcg agatccaggc gcggatcaat 3780
aaaagatcat tattttcaat agatctgtgt gttggttttt tgtgtgcctt gggggagggg 3840
gaggccagaa tgaggcgcgg ccaaggggga gggggaggcc agaatgacct tgggggaggg 3900
ggaggccaga atgaccttgg gggaggggga ggccagaatg aggcgcgccc ccgggtaccg 3960
agctcgaatt cactggccgt cgttttacaa cgtcgtgact gggaaaaccc tggcgttacc 4020
caacttaatc gccttgcagc acatccccct ttcgccagct ggcgtaatag cgaagaggcc 4080
cgcaccgatc gcccttccca acagttgcgc agcctgaatg gcgaatggcg cctgatgcgg 4140
tattttctcc ttacgcatct gtgcggtatt tcacaccgca tatggtgcac tctcagtaca 4200
atctgctctg atgccgcata gttaagccag ccccgacacc cgccaacacc cgctgacgcg 4260
ccctgacggg cttgtctgct cccggcatcc gcttacagac aagctgtgac cgtctccggg 4320
agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac gcgcgagacg aaagggcctc 4380
gtgatacgcc tatttttata ggttaatgtc atgataataa tggtttctta gacgtcaggt 4440
ggcacttttc ggggaaatgt gcgcggaacc cctatttgtt tatttttcta aatacattca 4500
aatatgtatc cgctcatgag acaataaccc tgataaatgc ttcaataata ttgaaaaagg 4560
aagagtatga gtattcaaca tttccgtgtc gcccttattc ccttttttgc ggcattttgc 4620
cttcctgttt ttgctcaccc agaaacgctg gtgaaagtaa aagatgctga agatcagttg 4680
ggtgcacgag tgggttacat cgaactggat ctcaacagcg gtaagatcct tgagagtttt 4740
cgccccgaag aacgttttcc aatgatgagc acttttaaag ttctgctatg tggcgcggta 4800
ttatcccgta ttgacgccgg gcaagagcaa ctcggtcgcc gcatacacta ttctcagaat 4860
gacttggttg agtactcacc agtcacagaa aagcatctta cggatggcat gacagtaaga 4920
gaattatgca gtgctgccat aaccatgagt gataacactg cggccaactt acttctgaca 4980
acgatcggag gaccgaagga gctaaccgct tttttgcaca acatggggga tcatgtaact 5040
cgccttgatc gttgggaacc ggagctgaat gaagccatac caaacgacga gcgtgacacc 5100
acgatgcctg tagcaatggc aacaacgttg cgcaaactat taactggcga actacttact 5160
ctagcttccc ggcaacaatt aatagactgg atggaggcgg ataaagttgc aggaccactt 5220
ctgcgctcgg cccttccggc tggctggttt attgctgata aatctggagc cggtgagcgt 5280
gggtctcgcg gtatcattgc agcactgggg ccagatggta agccctcccg tatcgtagtt 5340
atctacacga cggggagtca ggcaactatg gatgaacgaa atagacagat cgctgagata 5400
ggtgcctcac tgattaagca ttggtaactg tcagaccaag tttactcata tatactttag 5460
attgatttaa aacttcattt ttaatttaaa aggatctagg tgaagatcct ttttgataat 5520
ctcatgacca aaatccctta acgtgagttt tcgttccact gagcgtcaga ccccgtagaa 5580
aagatcaaag gatcttcttg agatcctttt tttctgcgcg taatctgctg cttgcaaaca 5640
aaaaaaccac cgctaccagc ggtggtttgt ttgccggatc aagagctacc aactcttttt 5700
ccgaaggtaa ctggcttcag cagagcgcag ataccaaata ctgtccttct agtgtagccg 5760
tagttaggcc accacttcaa gaactctgta gcaccgccta catacctcgc tctgctaatc 5820
ctgttaccag tggctgctgc cagtggcgat aagtcgtgtc ttaccgggtt ggactcaaga 5880
cgatagttac cggataaggc gcagcggtcg ggctgaacgg ggggttcgtg cacacagccc 5940
agcttggagc gaacgaccta caccgaactg agatacctac agcgtgagct atgagaaagc 6000
gccacgcttc ccgaagggag aaaggcggac aggtatccgg taagcggcag ggtcggaaca 6060
ggagagcgca cgagggagct tccaggggga aacgcctggt atctttatag tcctgtcggg 6120
tttcgccacc tctgacttga gcgtcgattt ttgtgatgct cgtcaggggg gcggagccta 6180
tggaaaaacg ccagcaacgc ggccttttta cggttcctgg ccttttgctg gccttttgct 6240
cacatgttct ttcctgcgtt atcccctgat tctgtggata accgtattac cgcctttgag 6300
tgagctgata ccgctcgccg cagccgaacg accgagcgca gcgagtcagt gagcgaggaa 6360
gcggaagagc gcccaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc 6420
agctggcacg acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgtg 6480
agttagctca ctcattaggc accccaggct ttacacttta tgcttccggc tcgtatgttg 6540
tgtggaattg tgagcggata acaatttcac acaggaaaca gctatgacca tgattacgcc 6600
aagctagccc gggctagctt gcatgcctgc aggttttcga c 6641



<210> 106 
<211> 11784 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: modified 
ROSA26 locus

<400> 106 

ggcaggccct ccgagcgtgg tggagccgtt ctgtgagaca gccgggtacg agtcgtgacg 60 
ctggaagggg caagcgggtg gtgggcagga atgcggtccg ccctgcagca accggagggg 120 

gagggagaag ggagcggaaa agtctccacc ggacgcggcc atggctcggg gggggggggg 180
cagcggagga gcgcttccgg ccgacgtctc gtcgctgatt ggcttctttt cctcccgccg 240 
tgtgtgaaaa cacaaatggc gtgttttggt tggcgtaagg cgcctgtcag ttaacggcag 300 
ccggagtgcg cagccgccgg cagcctcgct ctgcccactg ggtggggcgg gaggtaggtg 360 
gggtgaggcg agctggacgt gcgggcgcgg tcggcctctg gcggggcggg ggaggggagg 420 

gagggtcagc gaaagtagct cgcgcgcgag cggccgccca ccctcccctt cctctggggg 480
agtcgtttta cccgccgccg gccgggcctc gtcgtctgat tggctctcgg ggcccagaaa 540 
actggccctt gccattggct cgtgttcgtg caagttgagt ccatccgccg gccagcgggg 600 
gcggcgagga ggcgctccca ggttccggcc ctcccctcgg ccccgcgccg cagagtctgg 660 
ccgcgcgccc ctgcgcaacg tggcaggaag cgcgcgctgg gggcggggac gggcagtagg 720 

gctgagcggc tgcggggcgg gtgcaagcac gtttccgact tgagttgcct caagaggggc 780
gtgctgagcc agacctccat cgcgcactcc ggggagtgga gggaaggagc gagggctcag 840 
ttgggctgtt ttggaggcag gaagcacttg ctctcccaaa gtcgctctga gttgttatca 900 
gtaagggagc tgcagtggag taggcgggga gaaggccgca cccttctccg gaggggggag 960 
gggagtgttg caataccttt ctgggagttc tctgctgcct cctggcttct gaggaccgcc 1020 

ctgggcctgg gagaatccct tccccctctt ccctcgtgat ctgcaactcc agtctttcgc 1080
ctaggtaacc gatatccctg caggggtgac ctgcacgtct agggcgcagt agtccagggt 1140 
ttccttgatg atgtcatact tatcctgtcc cttttttttc cacagctcgc ggttgaggac 1200 
aaactcttcg cggtctttcc agtactcctg caggtgactg actgagtcga cgacactgca 1260 
gagacctact tcactaacaa ccggtacagt tcgtggacca gatgggtgag gtggagtacg 1320 

cgcccgggga gcccaagggc acgccctggc acccgcaccg cggcttcgag accgtcacga 1380
ataacttcgt atagcataca ttatacgaag ttataagctc gatgaattct accgggtagg 1440 
ggaggcgctt ttcccaaggc agtctggagc atgcgcttta gcagccccgc tggcacttgg 1500 
cgctacacaa gtggcctctg gcctcgcaca cattccacat ccaccggtag bgccaaccgg 1560 
ctccgttctt tggtggcccc ttcgcgccac cttctactcc tcccctagtc aggaagttcc 1620 

cccccgcccc gcagctcgcg tcgtgcagga cgtgacaaat ggaagtagca cgtctcacta 1680
gtctcgtgca gatggacagc accgctgagc aatggaagcg ggtaggcctt tggggcagcg 1740 
gccaatagca gctttgctcc ttcgctttct gggctcagag gctgggaagg ggtgggtccg 1800 
ggggcgggct caggggcggg ctcaggggcg gggcgggcgc gaaggtcctc ccgaggcccg 1860
gcattctcgc acgcttcaaa agcgcacgtc tgccgcgctg ttctcctctt cctcatctcc 1920

gggcctttcg acgatccagc cgccaccatg aaaaagcctg aactcaccgc gacgtctgtc 1980
gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 2040 
gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 2100 
agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 2160 
ctcccgattc cggaagtgct tgacattggg gaattcagcg agagcctgac ctattgcatc 2220 

tcccgccgtg cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 2280
ctgcagccgg tcgcggaggc catggatgcg atcgctgcgg ccgatcttag ccagacgagc 2340 
gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 2400 
tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 2460
gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 2520 

cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 2580
acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 2640 
atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 2700 
aggcatccgg agcttgcagg atcgccgcgg ctccgggcgt atatgctccg cattggtctt 2760 
gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 2820 

cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 2880
agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 2940 
cgccccagca ctcgtccgag ggcaaaggaa tagtcgatgc agaaattgat gatctattaa 3000 
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6540 
6600 
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acaataaaga tgtccactaa aatggaagtt 
gaacagagta cctacatttt gaatggaagg 
gattagataa atgcctgctc tttactgaag 
catagttgga tatcataatt taaacaagca 
5 cactcatgat ctatagatct atagatctct 
actttgtggt tctaagtact gtggtttcca 
gatcagcagc ct.ctgttcca catacacbtc 
tccatcagaa gcttcagctg ctcgactaga 
aggttttact tgctttaaaa aacctcccac 

10 atgcaattgt tgttgttaac ttgtttattg 
gcatcacaaa tttcacaaat aaagcatttt 
aactcatcaa tgtatcttat catgtctgga 
aactgagaga actcaaaggt taccccagtt 
tccataactt cgtatagcat acattatacg 

15 agcttggcac tggccgtcgt tttacaacgt 
cttaatcgcc ttgcagcaca tccccctttc 
accgatcgcc cttcccaaca gttgcgcagc 
ccggcaccag aagcggtgcc ggaaagctgg 
gtcgtcgtcc cctcaaactg gcagatgcac 

20 acctatccca ttacggtcaa tccgccgttt 
. tcgctcacat ttaatgttga tgaaagctgg 
gatggcgtta actcggcgtt tcatctgtgg 
gacagtcgtt tgccgtctga atttgacctg 
ctcgcggtga tggtgctgcg ttggagtgac 

25 cggatgagcg gcattttccg tgacgtctcg 
gatttccatg ttgccactcg ctttaatgat 
gttcagatgt gcggcgagtt gcgtgactac 
gaaacgcagg tcgccagcgg caccgcgcct 
ggttatgccg atcgcgtcac actacgtctg 

30 gaaatcccga atctctatcg tgcggtggtt 
gaagcagaag cctgcgatgt cggtttccgc 
ctgaacggca agccgttgct gattcgaggc 
ggtcaggtca tggatgagca gacgatggtg 
tttaacgccg tgcgctgttc gcattatccg 

35 cgctacggcc tgtatgtggt ggatgaagcc 
aatcg-fcctga ccgatgatcc gcgctggcta 
gtgcagcgcg atcgtaatca cccgagtgtg 
cacggcgcta atcacgacgc gctgt.atcgc 
gtgcagtatg aaggcggcgg agccgacacc 

40 gcgcgcgtgg atgaagacca gcccttcccg 
ctttcgctac ctggagagac gcgccpgctg 
aacagtcttg gcggtttcgc taaatactgg 
ggcggcttcg tctgggactg ggtggatcag 
ccgtggtcgg cttacggcgg tgattttggc 

45 aacggtctgg tctttgccga ccgcacgccg 
cagcagtttt tccagttccg tttatccggg 
ttccgtcata gcgataacga gctcctgcac 
gcaagcggtg aagtgcctct ggatgtcgct 
gaactaccgc agccggagag cgccgggcaa 

50 aacgcgaccg catggtcaga agccgggcac 
gaaaacctca gtgtgacgct ccccgccgcg 
gaaatggatt tttgcatcga gctgggtaat 
tttctttcac agatgtggat tggcgataaa 
ttcacccgtg caccgctgga taacgacatt 

55 aacgcctggg tcgaacgctg gaaggcggcg 
cagtgcacgg cagatacact tgctgatgcg 
catcagggga aaaccttatt tatcagccgg 
atggcgatta ccgttgatgt tgaagtggcg 
ctgaactgcc agctggcgca ggtagcagag 

60 gaaaactatc ccgaccgcct tactgccgcc 
gacatgtata ccccgtacgt cttcccgagc 
ttgaattatg gcccacacca gtggcgcggc 
caacagcaac tgatggaaac cagccatcgc 
ctgaatatcg acggtttcca tatggggatt 

65 tcggcggaat tccagctgag cgccggtcgc 
taataataac cgggcagggg ggatctttgt 
attggacaaa ctacctacag agatttaaag 
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tttcctgtca tactttgtta agaagggtga 3060 
attggagcta cgggggtggg ggtggggtgg 3120 
gctctttact attgctttat gataatgttt 3180 
aaaccaaatt aagggccagc tcattcctcc 3240 
cgtgggatca ttgtttttct cttgattccc 3300 
aatgtgtcag tttcatagcc tgaagaacga 3360 
attctcagta ttgttttgcc aagttctaat 3420 
ggatcataat cagccatacc acatttgtag 3480 
acctccccct gaacctgaaa cataaaatga 3540 
cagcttataa tggttacaaa taaagcaata -3600 
tttcactgca ttctagttgt ggtttgtcca 3660 
tccgtgtcat gtcggcgacc ctacgccccc 3720 
ggggcactac tcccgaaaac cgcttctgga 3780 
aagttatacc gggccaccat ggtcgcgagt 3840. 
cgtgactggg aaaaccctgg cgttacccaa 3900 
gccagctggc gtaatagcga agaggcccgc 3960 
ctgaatggcg aatggcgctt tgcctggttt 4020 
ctggagtgcg atcttcctga ggccgatact 4080 
ggttacgatg cgcccatcta caccaacgta 4140 
gttcccacgg agaatccgac gggttgttac 4200 
ctacaggaag gccagacgcg aattattttt 4260 
tgcaacgggc gctgggtcgg ttacggccag 4320 
agcgcatttt tacgcgccgg agaaaaccgc 4380 
ggcagttatc tggaagatca ggatatgtgg 4440 
ttgctgcata aaccgactac acaaatcagc 4500 
gatttcagcc gcgctgtact ggaggctgaa 4560 
ctacgggtaa cagtttcttt atggcagggt 4 620 
ttcggcggtg aaattatcga tgagcgtggt 4 680 
aacgtcgaaa acccgaaact gtggagcgcc 4740 
gaactgcaca ccgccgacgg cacgctgatt 4800 
gaggtgcgga ttgaaaatgg tctgctgctg 4860 
gttaaccgtc acgagcatca tcctctgcat 4 920 
caggatatcc tgctgatgaa gcagaacaac 4 980 
aaccatccgc tgtggtacac gctgtgcgac 5040 
aatattgaaa cccacggcat ggtgccaatg 5100 
ccggcgatga gcgaacgcgt aacgcgaatg 5160 
atcatctggt cgctggggaa tgaatcaggc 5220 
tggatcaaat ctgtcgatcc ttcccgcccg 5280 
acggccaccg atattatttg cccgatgtac 5340 
gctgtgccga aatggtccat caaaaaatgg 5400 
atcctttgcg aatacgccca cgcgatgggt. 5460 
caggcgtttc gtcagtatcc ccgtttacag 5520 
tcgctgatta aatatgatga aaacggcaac 5580 
gatacgccga acgatcgcca gtrtctgtatg 5640 
catccagcgc tgacggaagc aaaacaccag 5700 
caaaccatcg aagtgaccag cgaatacctg 5760 
tggatggtgg cgctggatgg taagccgctg 5820 
ccacaaggta aacagttgat tgaactgcct 5880 
ctctggctca cagtacgcgt agtgcaaccg 5940 
atcagcgcct ggcagcagtg gcgtctggcg 6000 
tcccacgcca tcccgcatct gaccaccagc 6060 
aagcgttggc aatttaaccg ccagtcaggc 6120 
aaacaactgc tgacgccgct gcgcgatcag 6180 
ggcgtaagtg aagcgacccg cattgaccct 6240 
ggccattacc aggccgaagc agcgttgttg 6300 
gtgctgatta cgaccgctca cgcgtggcag 6360 
aaaacctacc ggattgatgg tagtggtcaa 6420 
agcgatacac cgcatccggc gcggattggc 6480 
cgggtaaact ggctcggatt agggccgcaa 6540 
tgttttgacc gctgggatct gccattgtca 6600 
gaaaacggtc tgcgctgcgg gacgcgcgaa 6660 
gacttccagt tcaacatcag ccgctacagt 6720 
catctgctgc acgcggaaga aggcacatgg 6780 
ggtggcgacg actcctggag cccgtcagta 684 0 
taccattacc agttggtctg gtgtcaaaaa 6900 
gaaggaacct tacttctgtg gtgtgacata 6960 
ctctaaggta aatataaaat ttttaagtgt 7020 
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ataatgtgtt aaactactga ttctaattgt 
tgatgaatgg gagcagtggt ggaatgccag 
ttggacaaac cacaactaga atgcagtgaa 
ctattgcttt atttgtaacc attataagct 
5 ttcattttat gtttcaggtt cagggggagg 
tctacaaatg tggtatggct gattatgatc 
ggtaaccgaa gttcctatac tttctaga,ga 
taagcgctag cctagaagat gggcgggagt 
gtgtgggcgt tgtcctgcag gggaattgaa 

10 cacagatttt cggttttgtc gggaagtttt 
ataggtagtc atctggggtt ttatgcagca 
cctcggagta ttttccatcg aggtagatta 
ctgcttgaga tccttactac agtatgaaat 
gaattttaat catttttaaa gagcccagta 

15 agccttatca aaaggtattt tagaacactc 
gcttatccaa cccctagaca gagcattggc 
tgactcatga aaccagacag attagttaca 
ctcaacactg cagttctttt ataactcctt 
tccttaattt tcagtgtcta tcacctctcc 

20 ctcagtccag ggagttttac aacaatagat 
tccactccca tgaatgcctc tctccttttt 
aatggttcca ggtggatgtc tcctccccat 
ctgatatttt aagacattaa aaggtatatt 
gcttactaaa attttgtcat tgtacacatc 

25 gttcaggtgt ttgttgtctt tcctgaccta 
aagcagtgct ttctcttgga ctggcttgac 
aaatgtgatt ttgccaagct tcttcaggac 
caagtaaaat gattaagcaa caaatgtatt 
gtgtgtgctt gtgctctata ataatactat 

30 agagcacaga ctgctcttcc agaagtcctg 
cacaaccatc tgtaatggga tctgatgccc 
attcacatta aataaataaa tcctccttct 
tgtctccagt agaatttact gaagtaatga 
caataatcaa attactcttt aagcactgga 

35 agtgtaactg tggacagagg agccataact 
agactttaat gtcttttctc ttacactaag 
atcctatttg tttaaactgc tagctttact 
aaagctaagt ctgcagccat tactaaacat 
aaaatgtagg gccagagttt agccagccag 

40 cagcactctg gaggcagaga caggcagatc 
tcaagttcta tctaggatag ccaggaatac 
tgagatttca taaaattata attgaagcat 
atccgtctac ctttctgatg agatttgggt 
gtcttttgac actgtgggct ttctttaaag 

45 ctactaactt cccatggctt aaatggcatg 
atttgcagcc tgatttccag ggtggggttg 
taattttttt tttaaaaaat gggttatata 
aggtggacta atattaaatg agtccctccc 
tatacttaac ttttttttta aatgtggtat 

50 atacagaaac tgttgcatcg cttaatcaga 
ttcttcacag ccaaagtcaa attaagaatt 
gaatataaaa atgatagctt ttcctgaggc 
gcaacaagat atgtagacta aagttctgcc 
atgtagtaat acttttggaa cttgcaggtc 

55 gcttgggtga tagttggtaa aatgtgtttc 
caacctactt tttaaaaaaa aaagccaggc 
acttgctgag cacacaagag tagttacttg 
aacaaggcag acaaccaaga aactacagtt 
tacacaggga tattaaaata ttccaaataa 

60 gggacatgga tttctccggt gaataggcag 
gatttgtgaa attgttttca agtgatagtt 
atttcgaggt ctcttggttt atactcagaa 
gatcatgtgc taggcctacc ttaggctgat 
aggtgatgtc atatgatttc atatatcaag 

65 tacttaatgt gaaagttagg tctttgtggg 
aataagtcat ttttacatgt cttacatttg 
agactctctg acctagtaac cctacctata 
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ttgtgtattt tagattccaa cctatggaac 7080 
atccagacat gataagatac attgatgagt 7140 
aaaaatgctt tatttgtgaa atttgtgatg 7200 
gcaataaaca agttaacaac aacaattgca 7260 
tgtgggaggt tttttaaagc aagtaaaacc 7320 
tgcggccaaa tcggccggcc taggcgcgcc 7380 
ataggaactt cggaatagga acttcaagct 7 440 
cttctgggca ggcttaaagg ctaacctggt 7500 
caggtgtaaa attggaggga caagactt cc 7560 
ttaatagggg caaataagga aaatgggagg 7 620 
aaactacagg ttattattgc ttgtgatccg 7 680 
aagacatgct cacccgagtt ttatactctc 7740 
tacagtgtcg cgagttagac tatgtaagca 7800 
cttcatatcc atttcfccccg ctccttctgc 78 60 
attttagccc cattttcatt tattatactg 7920 
attttccctt tcctgatctt agaagtctga 7 980 
tacaccacaa atcgaggctg tagctggggc 8040 
agtacacttt ttgttgatcc tttgccttga 8100 
cgtcagtggt gttccacatt tgggcctatt 8160 
gtattgagaa tccaacctaa agcttaactt 8220 
ctccatttat aaactgagcfc attaaccatt 8280 
attacctgat gtatcttaca tattgccagg 8340 
tcattattga gccacatggt attgattact 8400 
tgtaaaaggt ggttcctttt ggaatgcaaa 84 60 
aggtcttgtg agcttgtatt ttttctattt 8520 
tcatggcatt ctacacgtta ttgctggtct 8580 
ctataatttt gcttgacttg tagccaaaca 8 64 0 
tgtgaagctt ggtttttagg ttgttgtgtt 8700 
ccaggggctg gagaggtggc tcggagttca 8760 
agttcaattc ccagcaacca catggtggct 8820 
tcttctggtg tgtctgaaga ccacaagtgt 8880 
tcttcttttt ttttttttta aagagaatac 8 940 
aatactttgt gtttgttcca atatggtagc 9000 
aatgttacca aggaactaat ttttatttga 9060 
gcagacttgt gggatacaga agaccaatgc 9120 
caataaagaa ataaaaattg aacttctagt 9180 
taacttttgt gcttcatcta tacaaagctg 9240 
gaaagcaagt aatgataatt ttggatttca 9300 
tggtggtgct tgcctttatg cctttaatcc 9360 
tctgagtttg agcccagcct ggtctacaca 9420 
acacagaaac cctgttgggg aggggggctc 9480 
tccctaatga gccactatgg atgtggctaa 9540 
attatttttt ctgtctctgc tgttggttgg 9600 
cctccttcct gccatgtggt ctcttgtttg 9660 
gctttttgcc ttctaagggc agctgctgag 9720 
ggaaatcttt caaacactaa aattgtcctt 9780 
ataaacctca taaaatagtt atgaggagtg 9840 
ctataaaaga gctattaagg ctttttgtct 9900 
ctttagaacc aagggtctta gagttttagt 9960 
ttttctagtt tcaaatccag agaatccaaa 10020 
tctgactttt aatgttaatt tgcttactgt 10080 
agggtctcac tatgtatctc tgcctgatct 10140 
tgcttttgtc tcctgaatac taaggttaaa 10200 
agattctttt ataggggaca cactaaggga 10260 
aagtgatgaa aacttgaatt attatcaccg 10320 
ctgttagagc atgcttaagg gatccctagg 10380 
gcaggctcct ggtgagagca tatttcaaaa 10440 
aaggttacct gtctttaaac catctgcata 10500 
tatttcattc aagttttccc ccatcaaatt 10560 
agttggaaac taaacaaatg ttggttttgt 10620 
aaagcccatg agatacagaa caaagctgct 10680 
gcacttcttt gggtttccct gcactatcct 10740 
tgttgttcaa ataaacttaa gtttcctgtc 10800 
gcaaaacatg ttatatatgt taaacatttg 10860 
tttgattttt aattttcaaa acctgagcta 10920 
gtggaattgt ataattgtgg tttgcaggca 10980 
gagcactttg ctgggtcaca agtctaggag 11040 
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tcaagcattt caccttgaag ttgagacgtt ttgttagtgt atactagttt atatgttgga 11100 
ggacatgttt atccagaaga tattcaggac tatttttgac tgggctaagg aattgattct 11160 
gattagcact gttagtgagc attgagtggc ctttaggctt gaattggagt cacttgtata 11220 
tctcaaataa tgctggcctt ttttaaaaag cccttgttct ttatcaccct gttttctaca 11280 
taatttttgt tcaaagaaat acttgtttgg atctcctttt gacaacaata gcatgttttc 11340 
aagccatatt ttttttcctt tttttttttt tttttggttt ttcgagacag ggtttctctg 11400 
tatagccctg gctgtcctgg aactcacttt gtagaccagg ctggcctcga actcagaaat 11460 
ccgcctgcct ctgcctcctg agtgccggga ttaaaggcgt gcaccaccac gcctggctaa 11520 
gttggatatt ttgttatata actataacca atactaactc cactgggtgg atttttaatt 11580 
cagtcagtag tcttaagtgg tctttattgg cccttcatta aaatctactg ttcactctaa 11640 
cagaggctgt tggtactagt ggcacttaag caacttccta cggatatact agcagattaa 11700 
gggtcaggga tagaaactag tctagcgttt tgtataccta ccagctttat actaccttgt 11760 
tctgatagaa atatttcagg acat 11784 



<210> 107 
<211> 1458 
<212> DNA 

<213> Bacteriophage TP901-1 

<220> 

<221> CDS 

<222> (1)..(1455) 



<400> 107 

atg act aag aaa gta gca atc tat aca cga gta tcc act act aac caa 48 
Met Thr Lys Lys Val Ala Ile Tyr Thr Arg Val Ser Thr Thr Asn Gln 
1 5 10 15 

gca gag gaa ggg ttc tca att gat gag caa att gac cgt tta aca aaa 96 
Ala Glu Glu Gly Phe Ser Ile Asp Glu Gln Ile Asp Arg Leu Thr Lys 
20 25 30 

tat gct gaa gca atg ggg tgg caa gta tct gat act tat act gat gct 144 
Tyr Ala Glu Ala Met Gly Trp Gln Val Ser Asp Thr Tyr Thr Asp Ala 
35 40 45 

ggt ttt tca ggg gcc aaa ctt gaa cgc cca gca atg caa aga tta atc 192 
Gly Phe Ser Gly Ala Lys Leu Glu Arg Pro Ala Met Gln Arg Leu Ile 
50 55 60 

aac gat atc gag aat aaa gct ttt gat aca gtt ctt gta tat aag cta 240 
Asn Asp Ile Glu Asn Lys Ala Phe Asp Thr Val Leu Val Tyr Lys Leu 
65 70 75 80 

gac cgc ctt tca cgt agt gta aga gat act ctt tat ctt gtt aag gat 288 
Asp Arg Leu Ser Arg Ser Val Arg Asp Thr Leu Tyr Leu Val Lys Asp 
85 90 95 

gtg ttc aca aaa aat aaa ata gac ttt atc tcg ctt aat gaa agt att 336 
Val Phe Thr Lys Asn Lys Ile Asp Phe Ile Ser Leu Asn Glu Ser Ile 
100 105 110 

gat act tct tct gct atg ggt agc ttg ttt ctc act att ctt tct gca 384 
Asp Thr Ser Ser Ala Met Gly Ser Leu Phe Leu Thr Ile Leu Ser Ala 
115 120 125 

att aat gag ttt gaa aga gag aat ata aaa gaa cgc atg act atg ggt 432 
Ile Asn Glu Phe Glu Arg Glu Asn Ile Lys Glu Arg Met Thr Met Gly 
130 135 140 

aaa cta ggg cga gcg aaa tct ggt aag tct atg atg tgg act aag aca 480 
Lys Leu Gly Arg Ala Lys Ser Gly Lys Ser Met Met Trp Thr Lys Thr 
145 150 155 160 

gct ttt ggg tat tac cac aac aga aag aca ggt ata tta gaa att gtt 528 
Ala Phe Gly Tyr Tyr His Asn Arg Lys Thr Gly Ile Leu Glu Ile Val 
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165 170 175 

cct tta caa gct aca ata gtt gaa caa ata ttc act gat tat tta tca 576 
Pro Leu Gln Ala Thr Ile Val Glu Gln Ile Phe Thr Asp Tyr Leu Ser 
180 185 190 

gga ata tca ctt aca aaa tta aga gat aaa ctc aat gaa tct gga cac 624 
Gly Ile Ser Leu Thr Lys Leu Arg Asp Lys Leu Asn Glu Ser Gly His 
195 200 205 

atc ggt aaa gat ata ccg tgg tct tat cgt acc cta aga caa aca ctt 672 
Ile Gly Lys Asp Ile Pro Trp Ser Tyr Arg Thr Leu Arg Gln Thr Leu 
210 215 220 

gat aat cca gtt tac tgt ggt tat atc aaa ttt aag gac agc cta ttt 720 
Asp Asn Pro Val Tyr Cys Gly Tyr Ile Lys Phe Lys Asp Ser Leu Phe 
225 230 235 240 

gaa ggt atg cac aaa cca att atc cct tat gag act tat tta aaa gtt 768 
Glu Gly Met His Lys Pro Ile Ile Pro Tyr Glu Thr Tyr Leu Lys Val 
245 250 255 

caa aaa gag cta gaa gaa aga caa cag cag act tat gaa aga aat aac 816 
Gln Lys Glu Leu Glu Glu Arg Gln Gln Gln Thr Tyr Glu Arg Asn Asn 
260 265 270 

aac cct aga cct ttc caa gct aaa tat atg ctg tca ggg atg gca agg 864 
Asn Pro Arg Pro Phe Gln Ala Lys Tyr Met Leu Ser Gly Met Ala Arg 
275 280 285 

tgc ggt tac tgt gga gca cct tta aaa att gtt ctt ggc cac aaa aga 912 
Cys Gly Tyr Cys Gly Ala Pro Leu Lys Ile Val Leu Gly His Lys Arg 
290 295 300 

aaa gat gga agc cgc act atg aaa tat cac tgt gca aat aga ttt cct 960 
Lys Asp Gly Ser Arg Thr Met Lys Tyr His Cys Ala Asn Arg Phe Pro 
305 310 315 320 

cga aaa aca aaa gga att aca gta tat aat gac aat aaa aag tgt gat 1008 
Arg Lys Thr Lys Gly Ile Thr Val Tyr Asn Asp Asn Lys Lys Cys Asp 
325 330 335 

tca gga act tat gat tta agt aat tta gaa aat act gtt att gac aac 1056 
Ser Gly Thr Tyr Asp Leu Ser Asn Leu Glu Asn Thr Val Ile Asp Asn 
340 345 350 

ctg att gga ttt caa gaa aat aat gac tcc tta ttg aaa att atc aat 1104 
Leu Ile Gly Phe Gln Glu Asn Asn Asp Ser Leu Leu Lys Ile Ile Asn 
355 360 365 

ggc aac aac caa cct att ctt gat act tcg tca ttt aaa aag caa att 1152 
Gly Asn Asn Gln Pro Ile Leu Asp Thr Ser Ser Phe Lys Lys Gln Ile 
370 375 380 

tca cag atc gat aaa aaa ata caa aag aac tct gat ttg tac cta aat 1200 
Ser Gln Ile Asp Lys Lys Ile Gln Lys Asn Ser Asp Leu Tyr Leu Asn 
385 390 395 400 

gat ttt atc act atg gat gag ttg aaa gat cgt act gat tcc ctt cag 1248 
Asp Phe Ile Thr Met Asp Glu Leu Lys Asp Arg Thr Asp Ser Leu Gln 
405 410 415 

gct gag aaa aag ctg ctt aaa gct aag att agc gaa aat aaa ttt aat 1296 
Ala Glu Lys Lys Leu Leu Lys Ala Lys Ile Ser Glu Asn Lys Phe Asn 
420 425 430 

gac tct act gat gtt ttt gag tta gtt aaa act cag ttg ggc tca att 1344 
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Asp Ser Thr Asp Val Phe Glu Leu Val Lys Thr Gln Leu Gly Ser Ile 
435 440 445 

ccg att aat gaa cta tca tat gat aat aaa aag aaa atc gtc aac aac 1392 
Pro Ile Asn Glu Leu Ser Tyr Asp Asn Lys Lys Lys Ile Val Asn Asn 
450 455 460 

ctt gta tca aag gtt gat gtt act gct gat aat gta gat atc ata ttt 1440 
Leu Val Ser Lys Val Asp Val Thr Ala Asp Asn Val Asp Ile Ile Phe 
465 470 475 480 

aaa ttc caa ctc gct taa 1458 
Lys Phe Gln Leu Ala 
485 
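
The pairing of codons and residues in a CDS listing of this kind can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and are not part of the listing. It translates the first 16 codons and the final codon row of the 1458-nucleotide coding sequence, and it raises a `KeyError` on any triplet containing a letter other than a, c, g or t, which makes transcription errors easy to spot.

```python
from itertools import product

# Standard genetic code, bases ordered t, c, a, g (assumed here;
# the listing itself does not name a translation table).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) up to the first stop codon."""
    seq = "".join(dna.lower().split())
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]  # KeyError on a non-nucleotide letter
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First and last codon rows of the coding sequence, in one-letter output.
first_row = "atg act aag aaa gta gca atc tat aca cga gta tcc act act aac caa"
last_row = "aaa ttc caa ctc gct taa"

print(translate(first_row))  # MTKKVAIYTRVSTTNQ
print(translate(last_row))   # KFQLA

# 1458 nucleotides = 486 codons = 485 residues plus one stop codon.
print(1458 // 3 - 1)         # 485
```

The one-letter output corresponds to Met Thr Lys Lys Val Ala Ile Tyr Thr Arg Val Ser Thr Thr Asn Gln for the first row and Lys Phe Gln Leu Ala for the last, and the length arithmetic agrees with the 485 residues given under <211> for the protein.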

<210> 108 
<211> 485 
<212> PRT 

<213> Bacteriophage TP901-1 
<400> 108 

Met Thr Lys Lys Val Ala Ile Tyr Thr Arg Val Ser Thr Thr Asn Gln 
1 5 10 15 

Ala Glu Glu Gly Phe Ser Ile Asp Glu Gln Ile Asp Arg Leu Thr Lys 
20 25 30 

Tyr Ala Glu Ala Met Gly Trp Gln Val Ser Asp Thr Tyr Thr Asp Ala 
35 40 45 

Gly Phe Ser Gly Ala Lys Leu Glu Arg Pro Ala Met Gln Arg Leu Ile 
50 55 60 

Asn Asp Ile Glu Asn Lys Ala Phe Asp Thr Val Leu Val Tyr Lys Leu 
65 70 75 80 

Asp Arg Leu Ser Arg Ser Val Arg Asp Thr Leu Tyr Leu Val Lys Asp 
85 90 95 

Val Phe Thr Lys Asn Lys Ile Asp Phe Ile Ser Leu Asn Glu Ser Ile 
100 105 110 

Asp Thr Ser Ser Ala Met Gly Ser Leu Phe Leu Thr Ile Leu Ser Ala 
115 120 125 

Ile Asn Glu Phe Glu Arg Glu Asn Ile Lys Glu Arg Met Thr Met Gly 
130 135 140 

Lys Leu Gly Arg Ala Lys Ser Gly Lys Ser Met Met Trp Thr Lys Thr 
145 150 155 160 

Ala Phe Gly Tyr Tyr His Asn Arg Lys Thr Gly Ile Leu Glu Ile Val 
165 170 175 

Pro Leu Gln Ala Thr Ile Val Glu Gln Ile Phe Thr Asp Tyr Leu Ser 
180 185 190 

Gly Ile Ser Leu Thr Lys Leu Arg Asp Lys Leu Asn Glu Ser Gly His 
195 200 205 

Ile Gly Lys Asp Ile Pro Trp Ser Tyr Arg Thr Leu Arg Gln Thr Leu 
210 215 220 

Asp Asn Pro Val Tyr Cys Gly Tyr Ile Lys Phe Lys Asp Ser Leu Phe 
225 230 235 240 
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Glu Gly Met His Lys Pro Ile Ile Pro Tyr Glu Thr Tyr Leu Lys Val 
245 250 255 

Gln Lys Glu Leu Glu Glu Arg Gln Gln Gln Thr Tyr Glu Arg Asn Asn 
260 265 270 

Asn Pro Arg Pro Phe Gln Ala Lys Tyr Met Leu Ser Gly Met Ala Arg 
275 280 285 

Cys Gly Tyr Cys Gly Ala Pro Leu Lys Ile Val Leu Gly His Lys Arg 
290 295 300 

Lys Asp Gly Ser Arg Thr Met Lys Tyr His Cys Ala Asn Arg Phe Pro 
305 310 315 320 

Arg Lys Thr Lys Gly Ile Thr Val Tyr Asn Asp Asn Lys Lys Cys Asp 
325 330 335 

Ser Gly Thr Tyr Asp Leu Ser Asn Leu Glu Asn Thr Val Ile Asp Asn 
340 345 350 

Leu Ile Gly Phe Gln Glu Asn Asn Asp Ser Leu Leu Lys Ile Ile Asn 
355 360 365 

Gly Asn Asn Gln Pro Ile Leu Asp Thr Ser Ser Phe Lys Lys Gln Ile 
370 375 380 

Ser Gln Ile Asp Lys Lys Ile Gln Lys Asn Ser Asp Leu Tyr Leu Asn 
385 390 395 400 

Asp Phe Ile Thr Met Asp Glu Leu Lys Asp Arg Thr Asp Ser Leu Gln 
405 410 415 

Ala Glu Lys Lys Leu Leu Lys Ala Lys Ile Ser Glu Asn Lys Phe Asn 
420 425 430 

Asp Ser Thr Asp Val Phe Glu Leu Val Lys Thr Gln Leu Gly Ser Ile 
435 440 445 

Pro Ile Asn Glu Leu Ser Tyr Asp Asn Lys Lys Lys Ile Val Asn Asn 
450 455 460 

Leu Val Ser Lys Val Asp Val Thr Ala Asp Asn Val Asp Ile Ile Phe 
465 470 475 480 

Lys Phe Gln Leu Ala 
485 
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


34 ± 18 

164 ± 54 
443 ± 151 

163 ± 36 
500 ± 65 

755 ± 601 

906 ± 316 
879 ± 291 

694 ± 345 
874 ± 741 

RLU (Luciferase) 

3631598 ± 903012 

2741969 ± 667568 
3798872 ± 1288020 

2471695 ± 611351 
3570103 ± 750628 

195822 ± 81858 

119043 ± 67451 
122557 ± 30054 

174380 ± 58876 
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